
Today's arXiv Picks | 13 Latest EMNLP 2021 Papers



About #今日arXiv精选 (Today's arXiv Picks)

This is a column run by 「AI 学术前沿」, whose editors select high-quality papers from arXiv each day and deliver them to readers.

Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications

Comment: EMNLP 2021

Link: http://arxiv.org/abs/2109.08627

Abstract

Sentence-level Quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels. Recent QE models have achieved previously-unseen levels of correlation with human judgments, but they rely on large multilingual contextualized language models that are computationally expensive and make them infeasible for real-world applications. In this work, we evaluate several model compression techniques for QE and find that, despite their popularity in other NLP tasks, they lead to poor performance in this regression setting. We observe that a full model parameterization is required to achieve SoTA results in a regression task. However, we argue that the level of expressiveness of a model in a continuous range is unnecessary given the downstream applications of QE, and show that reframing QE as a classification problem and evaluating QE models using classification metrics would better reflect their actual performance in real-world applications.
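
As a concrete illustration of the reframing the abstract argues for, the sketch below buckets continuous QE scores into discrete quality bands and scores predictions with classification metrics. The three bands and their thresholds are illustrative assumptions, not the paper's cut-offs.

```python
# Sketch: reframing sentence-level QE from regression to classification.
# The three quality bands and their thresholds are illustrative assumptions.
import numpy as np
from sklearn.metrics import classification_report

def scores_to_classes(scores, thresholds=(0.33, 0.66)):
    """Bucket continuous quality scores in [0, 1] into discrete bands."""
    return np.digitize(scores, thresholds)  # 0 = poor, 1 = ok, 2 = good

# Continuous human labels and model predictions (toy data).
human = np.array([0.10, 0.45, 0.80, 0.95, 0.30, 0.60])
preds = np.array([0.15, 0.50, 0.70, 0.90, 0.55, 0.65])

y_true = scores_to_classes(human)
y_pred = scores_to_classes(preds)

# Classification metrics replace Pearson correlation as the yardstick.
print(classification_report(y_true, y_pred, target_names=["poor", "ok", "good"]))
```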

Adversarial Scrubbing of Demographic Information for Text Classification

Comment: Accepted at EMNLP 2021

Link: http://arxiv.org/abs/2109.08613

Abstract

Contextual representations learned by language models can often encode undesirable attributes, like demographic associations of the users, while being trained for an unrelated target task. We aim to scrub such undesirable attributes and learn fair representations while maintaining performance on the target task. In this paper, we present an adversarial learning framework "Adversarial Scrubber" (ADS), to debias contextual representations. We perform theoretical analysis to show that our framework converges without leaking demographic information under certain conditions. We extend previous evaluation techniques by evaluating debiasing performance using Minimum Description Length (MDL) probing. Experimental evaluations on 8 datasets show that ADS generates representations with minimal information about demographic attributes while being maximally informative about the target task.
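
One standard way to realize such an adversarial objective is a gradient-reversal layer; the sketch below shows that construction, though the paper's exact training scheme may differ. All dimensions and the single-layer heads are illustrative.

```python
# Sketch: adversarial debiasing with a gradient-reversal layer (GRL), in the
# spirit of ADS. Dimensions and the single-layer heads are illustrative.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        # Reverse (and scale) gradients flowing into the encoder so the
        # encoder learns to *remove* the signal the adversary exploits.
        return -ctx.lam * grad_out, None

encoder = nn.Linear(768, 256)   # stand-in for a contextual encoder
task_head = nn.Linear(256, 2)   # target task (e.g. sentiment)
adv_head = nn.Linear(256, 2)    # adversary predicting a demographic attribute

x = torch.randn(4, 768)         # a batch of [CLS]-style representations
h = torch.relu(encoder(x))
task_logits = task_head(h)
adv_logits = adv_head(GradReverse.apply(h, 1.0))

y_task = torch.tensor([0, 1, 1, 0])
y_demo = torch.tensor([1, 0, 1, 0])
loss = nn.functional.cross_entropy(task_logits, y_task) \
     + nn.functional.cross_entropy(adv_logits, y_demo)
loss.backward()  # encoder receives task gradient minus adversary gradient
```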

Does Commonsense help in detecting Sarcasm?

Comment: Accepted at Insights from Negative Results in NLP Workshop, EMNLP  2021

Link: http://arxiv.org/abs/2109.08588

Abstract

Sarcasm detection is important for several NLP tasks such as sentiment identification in product reviews, user feedback, and online forums. It is a challenging task requiring a deep understanding of language, context, and world knowledge. In this paper, we investigate whether incorporating commonsense knowledge helps in sarcasm detection. For this, we incorporate commonsense knowledge into the prediction process using a graph convolution network with pre-trained language model embeddings as input. Our experiments with three sarcasm detection datasets indicate that the approach does not outperform the baseline model. We perform an exhaustive set of experiments to analyze where commonsense support adds value and where it hurts classification. Our implementation is publicly available at: https://github.com/brcsomnath/commonsense-sarcasm.
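
A minimal sketch of the modeling ingredient the abstract names: a graph convolution over a commonsense graph whose nodes start from pre-trained language-model embeddings. The tiny four-node graph and all sizes are assumptions, not the paper's configuration.

```python
# Sketch: one graph-convolution layer over a commonsense graph, with nodes
# initialized from PLM embeddings. Toy graph and hidden sizes are illustrative.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        # Symmetrically normalize the adjacency (with self-loops), then propagate.
        a = adj + torch.eye(adj.size(0))
        d_inv_sqrt = a.sum(dim=1).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
        return torch.relu(self.linear(a_norm @ h))

# 4 graph nodes (tokens / commonsense concepts) with 768-d PLM embeddings.
node_embeddings = torch.randn(4, 768)
adjacency = torch.tensor([[0., 1., 0., 1.],
                          [1., 0., 1., 0.],
                          [0., 1., 0., 1.],
                          [1., 0., 1., 0.]])

gcn = GCNLayer(768, 128)
node_states = gcn(node_embeddings, adjacency)         # (4, 128)
logits = nn.Linear(128, 2)(node_states.mean(dim=0))   # pooled graph -> sarcastic / not
```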

Mitigating Data Scarceness through Data Synthesis, Augmentation and Curriculum for Abstractive Summarization

Comment: To appear in proceedings of EMNLP 2021 (https://2021.emnlp.org/)

Link: http://arxiv.org/abs/2109.08569

Abstract

This paper explores three simple data manipulation techniques (synthesis, augmentation, curriculum) for improving abstractive summarization models without the need for any additional data. We introduce a method of data synthesis with paraphrasing, a data augmentation technique with sample mixing, and curriculum learning with two new difficulty metrics based on specificity and abstractiveness. We conduct experiments to show that these three techniques can help improve abstractive summarization across two summarization models and two different small datasets. Furthermore, we show that these techniques can improve performance when applied in isolation and when combined.
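
The sketch below illustrates two of the three techniques under stated assumptions: sample mixing read as standard mixup-style interpolation of embedded examples, and curriculum learning as sorting by a difficulty score. The `difficulty` field is a hypothetical stand-in for the paper's specificity and abstractiveness metrics; the paper's exact formulations may differ.

```python
# Sketch: sample mixing as data augmentation, read as standard mixup on
# embedded inputs; the paper's exact formulation may differ.
import torch

def mix_samples(emb_a, emb_b, alpha=0.2):
    """Interpolate two embedded training examples; return mix and its weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    return lam * emb_a + (1.0 - lam) * emb_b, lam

# Two source-document embeddings from a summarization batch (toy shapes).
doc_a, doc_b = torch.randn(128, 768), torch.randn(128, 768)
mixed, lam = mix_samples(doc_a, doc_b)
# The training loss is then mixed accordingly:
#   loss = lam * loss(summary_a) + (1 - lam) * loss(summary_b)

# Curriculum learning, sketched: order training pairs from easy to hard by a
# difficulty score (`difficulty` is a hypothetical stand-in field).
dataset = [{"doc": "...", "summary": "...", "difficulty": d} for d in (0.7, 0.1, 0.4)]
curriculum = sorted(dataset, key=lambda ex: ex["difficulty"])
```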

Exploring Multitask Learning for Low-Resource Abstractive Summarization

Comment: To appear in proceedings of EMNLP 2021 (https://2021.emnlp.org/)

Link: http://arxiv.org/abs/2109.08565

Abstract

This paper explores the effect of using multitask learning for abstractive summarization in the context of small training corpora. In particular, we incorporate four different tasks (extractive summarization, language modeling, concept detection, and paraphrase detection) both individually and in combination, with the goal of enhancing the target task of abstractive summarization via multitask learning. We show that for many task combinations, a model trained in a multitask setting outperforms a model trained only for abstractive summarization, with no additional summarization data introduced. Additionally, we do a comprehensive search and find that certain tasks (e.g. paraphrase detection) consistently benefit abstractive summarization, not only when combined with other tasks but also when using different architectures and training corpora.
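
A minimal hard-parameter-sharing sketch of the multitask setup described: one shared encoder with per-task heads. The heads are simplified to classifiers for brevity (in practice the abstractive-summarization head would be a decoder), and all sizes and task names are illustrative.

```python
# Sketch: shared encoder + per-task heads for multitask training.
# Heads are simplified to classifiers; sizes and task list are illustrative.
import torch
import torch.nn as nn

class MultitaskModel(nn.Module):
    def __init__(self, hidden=256, tasks=("abstractive", "extractive", "paraphrase")):
        super().__init__()
        self.encoder = nn.GRU(input_size=300, hidden_size=hidden, batch_first=True)
        self.heads = nn.ModuleDict({t: nn.Linear(hidden, 2) for t in tasks})

    def forward(self, x, task):
        _, h_n = self.encoder(x)           # shared representation
        return self.heads[task](h_n[-1])   # task-specific head

model = MultitaskModel()
batch = torch.randn(8, 50, 300)            # 8 sequences of 50 word vectors
for task in model.heads:                   # alternate tasks across training steps
    logits = model(batch, task)
```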

Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules

Comment: Appearing in the 2021 Conference on Empirical Methods in Natural  Language Processing (EMNLP)

Link: http://arxiv.org/abs/2109.08544

Abstract

One of the challenges faced by conversational agents is their inability to identify unstated presumptions of their users' commands, a task trivial for humans due to their common sense. In this paper, we propose a zero-shot commonsense reasoning system for conversational agents in an attempt to achieve this. Our reasoner uncovers unstated presumptions from user commands satisfying a general template of if-(state), then-(action), because-(goal). Our reasoner uses a state-of-the-art transformer-based generative commonsense knowledge base (KB) as its source of background knowledge for reasoning. We propose a novel and iterative knowledge query mechanism to extract multi-hop reasoning chains from the neural KB, which uses symbolic logic rules to significantly reduce the search space. Like any KB gathered to date, our commonsense KB is prone to missing knowledge. We therefore propose to conversationally elicit the missing knowledge from human users with our novel dynamic question generation strategy, which generates and presents contextualized queries to human users. We evaluate the model in a user study with human participants, achieving a 35% higher success rate compared to SOTA.
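
A toy sketch of the iterative multi-hop query loop with a symbolic pruning rule. `query_neural_kb` is a hypothetical stand-in for the transformer-based generative KB, and both the toy triples and the deduplication rule are illustrative assumptions.

```python
# Sketch: iterative multi-hop querying of a generative commonsense KB, with a
# symbolic filter pruning candidate hops. Everything here is a toy stand-in.
def query_neural_kb(head, relation):
    """Hypothetical: returns generated tail concepts for (head, relation)."""
    toy_kb = {("make coffee", "xNeed"): ["have coffee beans", "have water"],
              ("have coffee beans", "xNeed"): ["buy coffee beans"]}
    return toy_kb.get((head, relation), [])

def multi_hop_chain(goal, relation="xNeed", max_hops=3):
    """Expand a reasoning chain hop by hop, pruning with a symbolic rule."""
    chain, frontier = [], [goal]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for tail in query_neural_kb(node, relation):
                # Symbolic rule (illustrative): drop candidates already seen,
                # collapsing cycles and shrinking the search space.
                if tail not in chain:
                    chain.append(tail)
                    next_frontier.append(tail)
        if not next_frontier:
            break
        frontier = next_frontier
    return chain

# if-(state), then-(action), because-(goal): presumptions behind "make coffee".
print(multi_hop_chain("make coffee"))
# ['have coffee beans', 'have water', 'buy coffee beans']
```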

Simple Entity-Centric Questions Challenge Dense Retrievers

Comment: EMNLP 2021. The code and data is publicly available at  https://github.com/princeton-nlp/EntityQuestions

Link: http://arxiv.org/abs/2109.08535

Abstract

Open-domain question answering has exploded in popularity recently due to the success of dense retrieval models, which have surpassed sparse models using only a few supervised training examples. However, in this paper, we demonstrate current dense models are not yet the holy grail of retrieval. We first construct EntityQuestions, a set of simple, entity-rich questions based on facts from Wikidata (e.g., "Where was Arve Furset born?"), and observe that dense retrievers drastically underperform sparse methods. We investigate this issue and uncover that dense retrievers can only generalize to common entities unless the question pattern is explicitly observed during training. We discuss two simple solutions towards addressing this critical problem. First, we demonstrate that data augmentation is unable to fix the generalization problem. Second, we argue a more robust passage encoder helps facilitate better question adaptation using specialized question encoders. We hope our work can shed light on the challenges in creating a robust, universal dense retriever that works well across different input distributions.
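
For context, retriever comparisons of this kind are typically scored with top-k retrieval accuracy; the sketch below shows that measurement with a hypothetical `retrieve` callable standing in for either a dense (DPR-style) or sparse (BM25) retriever, run here on toy data.

```python
# Sketch: top-k retrieval accuracy, the usual yardstick in dense-vs-sparse
# comparisons. `retrieve` is a hypothetical stand-in for either retriever.
def top_k_accuracy(questions, gold_answers, retrieve, k=20):
    """Fraction of questions whose top-k passages contain the gold answer."""
    hits = 0
    for q, answer in zip(questions, gold_answers):
        passages = retrieve(q, k)   # hypothetical retriever call
        if any(answer.lower() in p.lower() for p in passages):
            hits += 1
    return hits / len(questions)

# Toy data: one entity-rich question whose toy passage contains the answer
# by construction. EntityQuestions instantiates one relation template
# ("Where was [E] born?") with many different entities [E].
questions = ["Where was [E] born?"]
gold = ["Oslo"]
toy_retrieve = lambda q, k: ["[E] was born in Oslo."]
print(top_k_accuracy(questions, gold, toy_retrieve))  # 1.0
```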

Neural Unification for Logic Reasoning over Natural Language

Comment: Accepted at EMNLP2021 Findings

Link: http://arxiv.org/abs/2109.08460

Abstract

Automated Theorem Proving (ATP) deals with the development of computer programs able to show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules). There exist several successful ATPs where conjectures and axioms are formally provided (e.g. formalised as First Order Logic formulas). Recent approaches, such as (Clark et al., 2020), have proposed transformer-based architectures for deriving conjectures given axioms expressed in natural language (English). The conjecture is verified through a binary text classifier, where the transformer model is trained to predict the truth value of a conjecture given the axioms. The RuleTaker approach of (Clark et al., 2020) achieves appealing results both in terms of accuracy and in the ability to generalize, showing that when the model is trained with deep enough queries (at least 3 inference steps), the transformers are able to correctly answer the majority of queries (97.6%) that require up to 5 inference steps. In this work we propose a new architecture, the Neural Unifier, and an associated training procedure, which achieves state-of-the-art results in terms of generalisation, showing that by mimicking a well-known inference procedure, backward chaining, it is possible to answer deep queries even when the model is trained only on shallow ones. The approach is demonstrated in experiments using a diverse set of benchmark data.
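
Classical backward chaining, the inference procedure the Neural Unifier is trained to mimic, looks like the sketch below on symbolic toy facts and rules; the paper operates on natural-language renderings of such axioms rather than symbols.

```python
# Sketch: classical backward chaining over toy propositional facts and rules.
def backward_chain(query, facts, rules, depth=0, max_depth=5):
    """Prove `query` by recursively proving the premises of a matching rule."""
    if depth > max_depth:
        return False
    if query in facts:
        return True
    for premises, conclusion in rules:
        if conclusion == query and all(
            backward_chain(p, facts, rules, depth + 1, max_depth) for p in premises
        ):
            return True
    return False

facts = {"bald eagle is a bird"}
rules = [
    (("bald eagle is a bird",), "bald eagle has feathers"),  # toy rule
    (("bald eagle has feathers",), "bald eagle can fly"),    # toy rule
]
print(backward_chain("bald eagle can fly", facts, rules))  # True: a 2-hop proof
```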

A Role-Selected Sharing Network for Joint Machine-Human Chatting Handoff and Service Satisfaction Analysis

Comment: 11 pages, 4 figures, accepted by the main conference of EMNLP 2021

Link: http://arxiv.org/abs/2109.08412

Abstract

Chatbots are increasingly thriving in different domains; however, because of unexpected discourse complexity and training-data sparseness, distrust of their reliability remains a serious concern. Recently, Machine-Human Chatting Handoff (MHCH), which predicts chatbot failure and enables human-algorithm collaboration to enhance chatbot quality, has attracted increasing attention from industry and academia. In this study, we propose a novel model, Role-Selected Sharing Network (RSSN), which integrates both dialogue satisfaction estimation and handoff prediction in one multi-task learning framework. Unlike prior efforts in dialog mining, by utilizing local user satisfaction as a bridge, the global satisfaction detector and handoff predictor can effectively exchange critical information. Specifically, we decouple the relation and interaction between the two tasks by the role information after the shared encoder. Extensive experiments on two public datasets demonstrate the effectiveness of our model.

To be Closer: Learning to Link up Aspects with Opinions

Comment: Accepted as a long paper in the main conference of EMNLP 2021

Link: http://arxiv.org/abs/2109.08382

Abstract

Dependency parse trees are helpful for discovering the opinion words in aspect-based sentiment analysis (ABSA). However, the trees obtained from off-the-shelf dependency parsers are static, and could be sub-optimal in ABSA. This is because the syntactic trees are not designed for capturing the interactions between opinion words and aspect words. In this work, we aim to shorten the distance between aspects and corresponding opinion words by learning an aspect-centric tree structure. The aspect and opinion words are expected to be closer along such a tree structure compared to the standard dependency parse tree. The learning process allows the tree structure to adaptively correlate the aspect and opinion words, enabling us to better identify the polarity in the ABSA task. We conduct experiments on five aspect-based sentiment datasets, and the proposed model significantly outperforms recent strong baselines. Furthermore, our thorough analysis demonstrates that the average distance between aspect and opinion words is shortened by at least 19% on the standard SemEval Restaurant14 dataset.
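
The distance being shortened is the path length between the aspect token and the opinion token in the tree. A small sketch of that measurement, with a toy head-index parse as an assumption:

```python
# Sketch: tree distance between an aspect word and an opinion word in a
# dependency parse. The toy tree maps each token to its head index (-1 = root).
from collections import deque

def tree_distance(heads, i, j):
    """Shortest path between tokens i and j in an undirected head-child tree."""
    n = len(heads)
    adj = {k: set() for k in range(n)}
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].add(head)
            adj[head].add(child)
    queue, seen = deque([(i, 0)]), {i}
    while queue:
        node, dist = queue.popleft()
        if node == j:
            return dist
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return -1

# "The pizza was surprisingly delicious" -> heads from a toy parse
# tokens:  0=The 1=pizza 2=was 3=surprisingly 4=delicious
heads = [1, 4, 4, 4, -1]   # 'delicious' is the root
print(tree_distance(heads, 1, 4))  # aspect 'pizza' <-> opinion 'delicious': 1
```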

CodeQA: A Question Answering Dataset for Source Code Comprehension

Comment: Findings of EMNLP 2021

Link: http://arxiv.org/abs/2109.08365

Abstract

We propose CodeQA, a free-form question answering dataset for the purpose of source code comprehension: given a code snippet and a question, a textual answer is required to be generated. CodeQA contains a Java dataset with 119,778 question-answer pairs and a Python dataset with 70,085 question-answer pairs. To obtain natural and faithful questions and answers, we implement syntactic rules and semantic analysis to transform code comments into question-answer pairs. We present the construction process and conduct systematic analysis of our dataset. Experiment results achieved by several neural baselines on our dataset are shown and discussed. While research on question answering and machine reading comprehension develops rapidly, little prior work has drawn attention to code question answering. This new dataset can serve as a useful research benchmark for source code comprehension.
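
A toy sketch of the rule-based comment-to-QA transformation the abstract mentions, with a single illustrative "Returns ..." rule; the actual dataset construction uses a much richer rule set plus semantic analysis.

```python
# Sketch: turning a code comment into a question-answer pair with one
# illustrative syntactic rule (not the dataset's actual rule set).
import re

def comment_to_qa(comment):
    """'Returns X.' -> ('What does it return?', 'X')  (one toy rule)."""
    match = re.match(r"Returns (.+?)\.?$", comment.strip(), re.IGNORECASE)
    if match:
        return "What does it return?", match.group(1)
    return None

print(comment_to_qa("Returns the index of the first matching element."))
# ('What does it return?', 'the index of the first matching element')
```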

Distilling Linguistic Context for Language Model Compression

Comment: EMNLP 2021. Code: https://github.com/GeondoPark/CKD

Link: http://arxiv.org/abs/2109.08359

Abstract

A computationally expensive and memory intensive neural network lies behind the recent success of language representation learning. Knowledge distillation, a major technique for deploying such a vast language model in resource-scarce environments, transfers the knowledge on individual word representations learned without restrictions. In this paper, inspired by the recent observations that language representations are relatively positioned and have more semantic knowledge as a whole, we present a new knowledge distillation objective for language representation learning that transfers the contextual knowledge via two types of relationships across representations: Word Relation and Layer Transforming Relation. Unlike other recent distillation techniques for the language models, our contextual distillation does not have any restrictions on architectural changes between teacher and student. We validate the effectiveness of our method on challenging benchmarks of language understanding tasks, not only in architectures of various sizes, but also in combination with DynaBERT, the recently proposed adaptive size pruning method.
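
One plausible reading of the Word Relation objective is matching pairwise similarity structure between teacher and student token representations, sketched below. Because only relations are matched, teacher and student widths can differ, which is the architectural freedom the abstract emphasizes; the exact loss form here is an assumption, not the paper's definition.

```python
# Sketch: a word-relation distillation loss matching pairwise cosine-similarity
# matrices of teacher and student token representations (an assumed form).
import torch
import torch.nn.functional as F

def word_relation_loss(teacher_h, student_h):
    """MSE between cosine-similarity matrices of token representations."""
    t = F.normalize(teacher_h, dim=-1)
    s = F.normalize(student_h, dim=-1)
    return F.mse_loss(s @ s.transpose(-1, -2), t @ t.transpose(-1, -2))

teacher_h = torch.randn(2, 16, 768)   # (batch, tokens, teacher dim)
student_h = torch.randn(2, 16, 312)   # smaller student dim: no projection needed
loss = word_relation_loss(teacher_h, student_h)
```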

Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU

Comment: To Appear in EMNLP 2021

Link: http://arxiv.org/abs/2109.08259

Abstract

While pre-trained language models have obtained state-of-the-art performance for several natural language understanding tasks, they are quite opaque in terms of their decision-making process. While some recent works focus on rationalizing neural predictions by highlighting salient concepts in the text as justifications or rationales, they rely on thousands of labeled training examples for both task labels as well as annotated rationales for every instance. Such extensive large-scale annotations are infeasible to obtain for many tasks. To this end, we develop a multi-task teacher-student framework based on self-training language models with limited task-specific labels and rationales, and judicious sample selection to learn from informative pseudo-labeled examples. We study several characteristics of what constitutes a good rationale and demonstrate that the neural model performance can be significantly improved by making it aware of its rationalized predictions, particularly in low-resource settings. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach.
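
A sketch of one round of confidence-based pseudo-label selection, the "judicious sample selection" ingredient. The threshold is an illustrative assumption, and the paper's rationale-aware selection criteria are omitted here.

```python
# Sketch: confidence-based selection of pseudo-labeled examples in one
# teacher-student self-training round. Threshold is an illustrative assumption.
import torch

def select_pseudo_labeled(teacher_logits, threshold=0.9):
    """Return (indices, labels) of unlabeled examples the teacher is sure about."""
    probs = torch.softmax(teacher_logits, dim=-1)
    conf, labels = probs.max(dim=-1)
    keep = conf >= threshold
    return keep.nonzero(as_tuple=True)[0], labels[keep]

# Teacher predictions over 5 unlabeled examples (toy logits).
teacher_logits = torch.tensor([[4.0, 0.1], [0.2, 0.3], [0.1, 5.0],
                               [1.0, 1.1], [6.0, 0.0]])
idx, pseudo = select_pseudo_labeled(teacher_logits)
# The student is then trained on (unlabeled[idx], pseudo) plus the few gold labels.
```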

