Author | 都一凡 (Yifan Du)
Affiliation | Gaoling School of Artificial Intelligence, Renmin University of China
Research area | Pre-trained models
ICLR is one of the top conferences in artificial intelligence. Its scope covers deep learning, statistics, and data science, along with important application areas such as computer vision, computational biology, speech recognition, text understanding, games, and robotics.
ICLR 2023 will be held in Kigali, Rwanda, from May 1 to May 5, 2023. Since the official list of accepted papers has not yet been released, this article selects more than 100 NLP-related papers from the submissions and organizes them by research topic for reference.
ICLR 2023 submissions: https://openreview.net/group?id=ICLR.cc/2023/Conference
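The grouping in this article was done by hand, but a first pass over a large pile of submission titles can be automated with simple keyword matching. The sketch below is a minimal illustrative example: the `TOPIC_KEYWORDS` table, the `bucket_title` helper, and the category labels are assumptions for demonstration, not the method actually used here. The first matching topic wins, so more specific categories should come earlier in the table.

```python
# Minimal keyword-based bucketing of paper titles (illustrative only).
# TOPIC_KEYWORDS and bucket_title are hypothetical names for this sketch.

TOPIC_KEYWORDS = {
    "Machine Translation": ["translation"],
    "Dialogue & QA": ["dialogue", "chatbot", "question answering"],
    "Code": ["code", "program"],
    "Text Generation": ["text generation", "summarization"],
}

def bucket_title(title: str) -> str:
    """Return the first topic whose keyword appears in the title (case-insensitive)."""
    lowered = title.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(kw in lowered for kw in keywords):
            return topic
    return "Other"  # unmatched titles fall through for manual review

titles = [
    "Simple and Scalable Nearest Neighbor Machine Translation",
    "CodeT: Code Generation with Generated Tests",
    "Mass-Editing Memory in a Transformer",
]
for t in titles:
    print(f"{bucket_title(t)}: {t}")
```

In practice many titles match no keyword or several, which is why a manual pass over the results, as done for the list below, is still needed.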
Contents
Models
Text Generation
Machine Translation
Dialogue & Question Answering
Knowledge & Reasoning
Multimodal
Information Retrieval
Code
Math
Knowledge Distillation
Representation Learning
Interpretability
Robustness
Other Tasks
Benchmark
1. Models
1.1 Model Architecture
EIT: Enhanced Interactive Transformer for Sequence Generation
Transformers with Multiresolution Attention Heads
SaMoE: Parameter Efficient MoE Language Models via Self-Adaptive Expert Combination
Sparse MoE with Random Routing as the New Dropout: Training Bigger and Self-Scalable Models
1.2 Model Training
Guess the Instruction! Making Language Models Stronger Zero-Shot Learners
LEXA: Language-agnostic Cross-consistency Training for Question Answering Tasks
CCT: Cross-consistency training for Clone Detection and Code Search Tasks
Large Language Models Can Self-improve
Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning
PMixUp: Simultaneous Utilization of Part-of-Speech Replacement and Feature Space Interpolation for Text Data Augmentation
Self-Consistent Learning: Cooperation between Generators and Discriminators
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Toward Adversarial Training on Contextualized Language Representation
ContraGen: Effective Contrastive Learning For Causal Language Model
Language Model Pre-training with Linguistically Motivated Curriculum Learning
MLM with Global Co-occurrence
Improving Language Model Pretraining with Text Structure Information
Learning by Distilling Context
MAT: Mixed-Strategy Game of Adversarial Training in Fine-tuning
Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks
1.3 Model Usage
Prompt Injection: Parameterization of Fixed Inputs
Meta-Weighted Language Model Tuning for Augmentation-Enhanced Few-Shot Learning
Pre-trained Language Models can be Fully Zero-Shot Learners
KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP
Contrastive Novelty Learning: Anticipating Outliers with Large Language Models
Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning
Mass-Editing Memory in a Transformer
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
Selective Annotation Makes Language Models Better Few-Shot Learners
Generate rather than Retrieve: Large Language Models are Strong Context Generators
Ahead-of-Time P-Tuning
Can discrete information extraction prompts generalize across language models?
2. Text Generation
Dynamic Scheduled Sampling with Imitation Loss for Neural Text Generation
DiffusER: Diffusion via Edit-based Reconstruction
MVP: Multi-task Supervised Pre-training for Natural Language Generation
Penalizing the High-likelihood: A Novel Sampling Method for Open-ended Neural Text Generation via Inverse Probability Weighting
RainProof: An Umbrella to Shield Text Generator from Out-Of-Distribution Data
A Non-monotonic Self-terminating Language Model
PromptSum: Planning with Mixed Prompts for Parameter-Efficient Controllable Abstractive Summarization
On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation
Joint Generator-Ranker Learning for Natural Language Generation
Calibrating Sequence likelihood Improves Conditional Language Generation
Sequence to sequence text generation with diffusion models
Tailoring Language Generation Models under Total Variation Distance
Language Models Can See: Plugging Visual Controls in Text Generation
Distribution Aware Metrics for Conditional Natural Language Generation
PEER: A Collaborative Language Model
3. Machine Translation
Seq2Seq Pre-training with Dual-channel Recombination for Translation
Simple and Scalable Nearest Neighbor Machine Translation
Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation
4. Dialogue & Question Answering
Towards Boosting the Open-Domain Chatbot with Human Feedback
Learning Locality and Isotropy in Dialogue Modeling
Knowledge-Consistent Dialogue Generation with Language Models and Knowledge Graphs
Complex-Target-Guided Open-Domain Conversation based on offline reinforcement learning
5. Knowledge & Reasoning
ReAct: Synergizing Reasoning and Acting in Language Models
Language model with Plug-in Knowldge Memory
Thrust: Adaptively Propels Large Language Models with External Knowledge
Self-Consistency Improves Chain of Thought Reasoning in Language Models
DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
Neuro-Symbolic Procedural Planning with Commonsense Prompting
Multimodal Analogical Reasoning over Knowledge Graphs
ThinkSum: Probabilistic reasoning over sets using large language models
Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation
Rethinking Identity in Knowledge Graph Embedding
gGN: learning to represent nodes in directed graphs as low-rank Gaussian distributions
Don't Throw Your Old Policies Away: Knowledge-based Policy Recycling Protects Against Adversarial Attacks
Measuring and Narrowing the Compositionality Gap in Language Models
6. Multimodal
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
CLIP model is an Efficient Continual Learner
Language Modelling with Pixels
Visual Classification via Description from Large Language Models
Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
RelationCLIP: Training-free Fine-grained Visual and Language Concept Matching
Contrastive Prompt Tuning Improves Generalization in Vision-Language Models
Masked Vision and Language Modeling for Multi-modal Representation Learning
UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks
Visually-augmented pretrained language models for NLP Tasks without Images
Music-to-Text Synaesthesia: Generating Descriptive Text from Music Recordings
VLG: General Video Recognition with Web Textual Knowledge
Dynamic Historical Adaptation for Continual Image-Text Modeling
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN
Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Language-Guided Artistic Style Transfer Using the Latent Space of DALL-E
Unified Vision and Language Prompt Learning
DrML: Diagnosing and Rectifying Vision Models using Language
MaPLe: Multi-modal Prompt Learning
Prefix Conditioning Unifies Language and Label Supervision
Domain-Unified Prompt Representations for Source-Free Domain Generalization
Learning to Decompose Visual Features with Latent Textual Prompts
Delving into the Openness of CLIP
Cali-NCE: Boosting Cross-modal Video Representation Learning with Calibrated Alignment
Design of the topology for contrastive visual-textual alignment
7. Information Retrieval
Multi-Vector Retrieval as Sparse Alignment
Augmenting Zero-shot Dense Retrievers With Plug-in Mixture-of-memories
CAMVR: Context-Adaptive Multi-View Representation Learning for Dense Retrieval
8. Code
Language Models Can Teach Themselves to Program Better
Repository-Level Prompt Generation for Large Language Models of Code
NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Deep Learning-based Source Code Complexity Prediction
FixEval: Execution-based Evaluation of Program Fixes for Competitive Programming Problems
InCoder: A Generative Model for Code Infilling and Synthesis
Code Translation with Compiler Representations
CodeT: Code Generation with Generated Tests
Multi-lingual Evaluation of Code Generation Models
9. Math
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions
Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning
10. Knowledge Distillation
Speed Up Iterative Non-Autoregressive Transformers by Distilling Multiple Steps
A comparison of dataset distillation and active learning in text classification
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Distilling Text-Image Foundation Models
11. Representation Learning
RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank
Neural Embeddings for Text
Ranking-Enhanced Unsupervised Sentence Representation Learning
Neural Topic Modeling with Embedding Clustering Regularization
Counterfactual Contrastive Learning for Robust Text Classification
On The Inadequacy of Optimizing Alignment and Uniformity in Contrastive Learning of Sentence Representations
12. Interpretability
ORCA: Interpreting Prompted Language Models via Locating Supporting Evidence in the Ocean of Pretraining Data
ContraSim -- A Similarity Measure Based on Contrastive Learning
13. Robustness
Learning from Others: Similarity-based Regularization for Mitigating Artifacts
Randomized Smoothing with Masked Inference for Adversarially Robust NLP Systems
14. Other Tasks
Exploring Methods for Parsing Movie Scripts - Feature Extraction for Further Social Injustice Analysis
MSQ-BioBERT: Ambiguity Resolution to Enhance BioBERT Medical Question-Answering
Compositional Semantic Parsing with Large Language Models
AxBERT: An Explainable Chinese Spelling Correction Method Driven by Associative Knowledge Network
BED: Boundary-Enhanced Decoder for Chinese Word Segmentation
Semi-connected Joint Entity Recognition and Relation Extraction of Contextual Entities in Family History Records
15. Benchmark
GuoFeng: A Discourse-aware Evaluation Benchmark for Language Understanding, Translation and Generation