1. Human Language Understanding & Reasoning
2. Attention Is All You Need (Transformers)
3. Blog Post: The Illustrated Transformer
4. HuggingFace's course on Transformers
1. Deep contextualized word representations (ELMo)
2. Improving Language Understanding by Generative Pre-Training (OpenAI GPT)
3. RoBERTa: A Robustly Optimized BERT Pretraining Approach
4. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
1. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
2. mT5: A massively multilingual pre-trained text-to-text transformer
3. AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
1. Language Models are Unsupervised Multitask Learners (GPT-2)
2. PaLM: Scaling Language Modeling with Pathways
3. OPT: Open Pre-trained Transformer Language Models
1. Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
2. True Few-Shot Learning with Language Models
3. Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
4. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
1. Factual Probing Is [MASK]: Learning vs. Learning to Recall
2. P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
3. LoRA: Low-Rank Adaptation of Large Language Models
4. Towards a Unified View of Parameter-Efficient Transfer Learning
1. What Makes Good In-Context Examples for GPT-3?
2. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
3. Data Distributional Properties Drive Emergent In-Context Learning in Transformers
4. What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
1. Noisy Channel Language Model Prompting for Few-Shot Text Classification
2. How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering
3. Language Models (Mostly) Know What They Know
1. Explaining Answers with Entailment Trees
2. Self-Consistency Improves Chain of Thought Reasoning in Language Models
3. Faithful Reasoning Using Large Language Models
1. Knowledge Neurons in Pretrained Transformers
2. Fast Model Editing at Scale
3. Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets
1. The Pile: An 800GB Dataset of Diverse Text for Language Modeling
2. Deduplicating Training Data Makes Language Models Better
1. Scaling Laws for Neural Language Models
2. Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
3. Scaling Laws for Autoregressive Generative Modeling
1. Quantifying Memorization Across Neural Language Models
2. Deduplicating Training Data Mitigates Privacy Risks in Language Models
3. Large Language Models Can Be Strong Differentially Private Learners
4. Recovering Private Text in Federated Learning of Language Models
1. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
2. Red Teaming Language Models with Language Models
3. Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
1. Challenges in Detoxifying Language Models
2. Detoxifying Language Models Risks Marginalizing Minority Voices
3. Plug and Play Language Models: A Simple Approach to Controlled Text Generation
4. GeDi: Generative Discriminator Guided Sequence Generation
1. Efficient Large Scale Language Modeling with Mixtures of Experts
2. Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
3. A Review of Sparse Expert Models in Deep Learning
1. Generalization through Memorization: Nearest Neighbor Language Models
2. Training Language Models with Memory Augmentation
3. Few-shot Learning with Retrieval Augmented Language Models
1. Learning to summarize from human feedback
2. Fine-Tuning Language Models from Human Preferences
3. MemPrompt: Memory-assisted Prompt Editing with User Feedback
4. LaMDA: Language Models for Dialog Applications
1. A Conversational Paradigm for Program Synthesis
2. InCoder: A Generative Model for Code Infilling and Synthesis
3. A Systematic Evaluation of Large Language Models of Code
4. Language Models of Code are Few-Shot Commonsense Learners
5. Competition-Level Code Generation with AlphaCode
1. Blog post: Generalized Visual Language Models
2. Learning Transferable Visual Models From Natural Language Supervision (CLIP)
3. Multimodal Few-Shot Learning with Frozen Language Models
4. CM3: A Causal Masked Multimodal Model of the Internet
1. Multitask Prompted Training Enables Zero-Shot Task Generalization
2. PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
3. Scaling Instruction-Finetuned Language Models
4. Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
1. A General Language Assistant as a Laboratory for Alignment
2. Alignment of Language Agents
3. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback