
[Paper Reading Notes 41] BERT Models for the Medical Domain

Paper 1: BioBERT

Title: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Paper: https://arxiv.org/abs/1901.08746
Code: https://github.com/naver/biobert-pretrained
Summary: From Korea University. Starting from general-domain pre-trained BERT weights, BioBERT is further pre-trained on a large corpus of English biomedical papers from PubMed, and it outperforms state-of-the-art models on multiple biomedical downstream tasks.
Citation: Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, Jaewoo Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, Volume 36, Issue 4, 15 February 2020, Pages 1234-1240. https://doi.org/10.1093/bioinformatics/btz682
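
To make these checkpoints concrete, here is a minimal sketch of using BioBERT as a feature extractor through Hugging Face transformers. The model id dmis-lab/biobert-base-cased-v1.1 is an assumed community-hosted checkpoint, not the repository linked above:

```python
# Minimal sketch: BioBERT as a sentence encoder via Hugging Face transformers.
# The model id below is an assumed community checkpoint, not the repo linked above.
from transformers import AutoTokenizer, AutoModel

name = "dmis-lab/biobert-base-cased-v1.1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer("Aspirin inhibits platelet aggregation.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```

The same pattern loads the other BERT-architecture checkpoints in this list once their hub ids are known.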

Paper 2: SciBERT

Title: SciBERT: A Pretrained Language Model for Scientific Text
Paper: https://arxiv.org/abs/1903.10676
Code: https://github.com/allenai/scibert/
Summary: From the AllenAI team. A scientific-domain BERT trained on 1.1M+ papers from Semantic Scholar.
Citation: Beltagy I., Lo K., Cohan A. SciBERT: A Pretrained Language Model for Scientific Text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019.

Paper 3: Clinical BERT

Title: Publicly Available Clinical BERT Embeddings
Paper: https://www.aclweb.org/anthology/W19-1909/
Code: https://github.com/EmilyAlsentzer/clinicalBERT
Summary: From the NAACL Clinical NLP Workshop 2019. A clinical-domain BERT trained on roughly 2 million clinical notes from the MIMIC-III database.
Citation: Alsentzer E., Murphy J. R., Boag W., et al. Publicly Available Clinical BERT Embeddings. 2019.

Paper 4: ClinicalBERT (another team's version)

Title: ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
Paper: https://arxiv.org/abs/1904.05342
Code: https://github.com/kexinhuang12345/clinicalBERT
Summary: Also trained on the MIMIC-III database, but using only 100,000 randomly sampled clinical notes.
Citation: Huang K., Altosaar J., Ranganath R. ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. 2019.

Paper 5: BEHRT

Title: BEHRT: Transformer for Electronic Health Records
Paper: https://arxiv.org/abs/1907.09538
Code: https://github.com/deepmedicine/BEHRT
Summary: From the University of Oxford. Here the input embeddings are learned over medical entities (coded clinical events such as diagnoses) rather than over words; a toy embedding sketch follows this entry.
Citation: Li Y., Rao S., Solares J., et al. BEHRT: Transformer for Electronic Health Records. Scientific Reports, 2020, 10(1).
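
As a hedged illustration of entity-based inputs, here is a toy sketch in which each input position is a coded clinical event combined with age and position embeddings. All vocabulary sizes, dimensions, and the class name are illustrative assumptions, not the paper's released code (BEHRT's full input representation also includes further components, e.g. segment embeddings):

```python
# Toy sketch of BEHRT-style inputs: sequences of medical concept codes
# (not words), summed with age and position embeddings.
import torch
import torch.nn as nn

class BehrtEmbedding(nn.Module):
    def __init__(self, n_codes=2000, n_ages=120, max_len=256, dim=128):
        super().__init__()
        self.code = nn.Embedding(n_codes, dim)  # diagnosis/medication codes
        self.age = nn.Embedding(n_ages, dim)    # patient age at each event
        self.pos = nn.Embedding(max_len, dim)   # position within the record

    def forward(self, codes, ages):
        positions = torch.arange(codes.size(1), device=codes.device)
        return self.code(codes) + self.age(ages) + self.pos(positions)

# one toy patient record: 4 coded events with the age at each event
codes = torch.tensor([[12, 305, 12, 877]])
ages = torch.tensor([[50, 50, 52, 53]])
print(BehrtEmbedding()(codes, ages).shape)  # (1, 4, 128)
```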

Paper 6: MC-BERT

Title: Conceptualized Representation Learning for Chinese Biomedical Text Mining
Paper: https://arxiv.org/pdf/2008.10813.pdf
Code: https://github.com/alibabaresearch/ChineseBLUE
Summary: Pre-training with whole-entity masking and whole-phrase masking of medical terms, so the model must predict complete medical concepts rather than isolated characters (see the sketch after this entry).
Citation: Ningyu Zhang, Qianghuai Jia, Kangping Yin, Liang Dong, Feng Gao, and Nengwei Hua. 2020. Conceptualized Representation Learning for Chinese Biomedical Text Mining. In WSDM '20, February 3-7, 2020, Houston. ACM, New York, NY, USA, 4 pages.
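
The sketch below illustrates the masking idea: once medical entity spans are known (from a lexicon or NER model, which the sketch takes as given), every token inside a chosen span is masked together. The function and the toy example are my own illustration, not the paper's released code:

```python
# Minimal sketch of whole-entity masking in the spirit of MC-BERT.
# Entity spans are assumed to come from a medical lexicon or NER model.
import random

def mask_whole_entities(tokens, entity_spans, mask_token="[MASK]", prob=0.15):
    """Mask every token of a selected entity span, so the model must
    reconstruct the complete medical concept, not a single character."""
    tokens = list(tokens)
    for start, end in entity_spans:  # half-open [start, end) token indices
        if random.random() < prob:
            for i in range(start, end):
                tokens[i] = mask_token
    return tokens

# toy example: mask the entity 糖尿病 (diabetes) as one unit
toks = list("患者被诊断为糖尿病")
print(mask_whole_entities(toks, [(6, 9)], prob=1.0))
# ['患', '者', '被', '诊', '断', '为', '[MASK]', '[MASK]', '[MASK]']
```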

Paper 7: MT-BERT (BlueBERT)

Title: An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
Paper: https://arxiv.org/pdf/2005.02799.pdf
Summary: Multi-task learning over the downstream tasks of text similarity, relation extraction, inference, and NER, with the BERT encoder parameters shared across tasks; a minimal sketch follows this entry.
Dataset: the BLUE benchmark, https://arxiv.org/pdf/1906.05474.pdf
Code: https://github.com/ncbi-nlp/BLUE_Benchmark
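
A minimal sketch of the shared-encoder idea: one BERT encoder whose parameters are updated by every task, with small task-specific heads on top. Head names, label counts, and the base checkpoint are illustrative assumptions, not the paper's exact configuration:

```python
# Minimal sketch of multi-task learning over a shared BERT encoder.
import torch.nn as nn
from transformers import AutoModel

class MultiTaskBert(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)  # shared by all tasks
        h = self.encoder.config.hidden_size
        self.heads = nn.ModuleDict({
            "similarity": nn.Linear(h, 1),  # sentence-pair regression
            "relation":   nn.Linear(h, 5),  # relation classes (toy count)
            "inference":  nn.Linear(h, 3),  # entailment labels
            "ner":        nn.Linear(h, 9),  # per-token BIO tags (toy count)
        })

    def forward(self, task, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        if task == "ner":  # token-level prediction
            return self.heads[task](out.last_hidden_state)
        return self.heads[task](out.last_hidden_state[:, 0])  # [CLS] vector
```

During training, batches from the different tasks are interleaved, so gradients from every task flow into the same encoder.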

Paper 8: BERT-MK

Title: Integrating Graph Contextualized Knowledge into Pre-trained Language Models
Paper: https://arxiv.org/pdf/1912.00147.pdf
Summary: From Huawei and USTC. The paper focuses on injecting graph-contextualized knowledge into pre-trained language models: entity representations are learned from knowledge-graph triples and then integrated into pre-training, so the information in the knowledge graph guides parameter learning (a toy triple-learning sketch follows this entry).
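
As a hedged illustration of learning entity representations from triples, here is a minimal TransE-style sketch. Note that TransE is a deliberately simpler objective than the paper's graph-contextualized encoder, and all sizes are toy assumptions:

```python
# Minimal TransE sketch: entity vectors trained so that for a true triple
# (head, relation, tail) the translation head + relation lands near tail.
import torch
import torch.nn as nn

class TransE(nn.Module):
    def __init__(self, n_entities=1000, n_relations=50, dim=128):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)

    def score(self, head, relation, tail):
        # smaller distance = more plausible triple
        return (self.ent(head) + self.rel(relation) - self.ent(tail)).norm(p=2, dim=-1)

# margin-ranking loss against a corrupted (negative) tail
model = TransE()
h, r, t = torch.tensor([3]), torch.tensor([1]), torch.tensor([8])
t_neg = torch.tensor([42])
loss = torch.relu(1.0 + model.score(h, r, t) - model.score(h, r, t_neg)).mean()
loss.backward()  # entity and relation embeddings receive gradients
```

The resulting entity vectors are what a BERT-MK-style model would then fuse into the pre-trained encoder.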

Paper 9: UmlsBERT

Title: UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus
Paper: https://aclanthology.org/2021.naacl-main.139/

Paper 10: SapBERT

Title: Self-Alignment Pretraining for Biomedical Entity Representations
Paper: https://aclanthology.org/2021.naacl-main.334/

Paper 11

Title: Are we there yet? Exploring clinical domain knowledge of BERT models
Paper: https://aclanthology.org/2021.bionlp-1.5/

Paper 12

Title: Stress Test Evaluation of Biomedical Word Embeddings
Paper: https://aclanthology.org/2021.bionlp-1.13/

Paper 13: BioELECTRA

Title: BioELECTRA: Pretrained Biomedical text Encoder using Discriminators
Paper: https://aclanthology.org/2021.bionlp-1.16/

Paper 14

Title: Improving Biomedical Pretrained Language Models with Knowledge
Paper: https://aclanthology.org/2021.bionlp-1.20/

Paper 15: EntityBERT

Title: EntityBERT: Entity-centric Masking Strategy for Model Pretraining for the Clinical Domain
Paper: https://aclanthology.org/2021.bionlp-1.21/

Paper 16

Title: ChicHealth @ MEDIQA 2021: Exploring the limits of pre-trained seq2seq models for medical summarization
Paper: https://aclanthology.org/2021.bionlp-1.29/

Paper 17: PubMedBERT

Title: Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
Paper: https://arxiv.org/pdf/2007.15779.pdf

Paper 18: SMedBERT

Title: SMedBERT: A Knowledge-Enhanced Pre-trained Language Model with Structured Semantics for Medical Text Mining
Paper: https://aclanthology.org/2021.acl-long.457.pdf
Code: https://github.com/MatNLP/SMedBERT
