Paper title: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Paper link: https://arxiv.org/abs/1901.08746
Project link: https://github.com/naver/biobert-pretrained
Summary: From Korea University. BioBERT starts from general-domain pre-trained BERT weights and continues pre-training on a large corpus of biomedical English articles from PubMed; it outperforms the previous state-of-the-art models on several biomedical downstream tasks.
Citation: Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, Jaewoo Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, Volume 36, Issue 4, 15 February 2020, Pages 1234–1240. https://doi.org/10.1093/bioinformatics/btz682
Paper title: SciBERT: A Pretrained Language Model for Scientific Text
Paper link: https://arxiv.org/abs/1903.10676
Project link: https://github.com/allenai/scibert/
Summary: From AllenAI. A scientific-domain BERT trained on 1.1M+ papers from Semantic Scholar.
Citation: Beltagy I., Lo K., Cohan A. SciBERT: A Pretrained Language Model for Scientific Text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Paper title: Publicly Available Clinical BERT Embeddings
Paper link: https://www.aclweb.org/anthology/W19-1909/
Project link: https://github.com/EmilyAlsentzer/clinicalBERT
Summary: From the NAACL Clinical NLP Workshop 2019. A clinical-domain BERT trained on roughly 2 million clinical notes from the MIMIC-III database.
Citation: Alsentzer E., Murphy J. R., Boag W., et al. Publicly Available Clinical BERT Embeddings. 2019.
Paper title: ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
Paper link: https://arxiv.org/abs/1904.05342
Project link: https://github.com/kexinhuang12345/clinicalBERT
Summary: Also trained on the MIMIC-III database, but on a random sample of only 100,000 clinical notes.
Citation: Huang K., Altosaar J., Ranganath R. ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. 2019.
Paper title: BEHRT: Transformer for Electronic Health Records
Paper link: https://arxiv.org/abs/1907.09538
Project link: https://github.com/deepmedicine/BEHRT
Summary: From the University of Oxford. In this paper the embeddings are trained over medical entities rather than over words.
Citation: Li Y., Rao S., Solares J., et al. BEHRT: Transformer for Electronic Health Records. Scientific Reports, 2020, 10(1).
Paper title: Conceptualized Representation Learning for Chinese Biomedical Text Mining (2020)
Paper link: https://arxiv.org/pdf/2008.10813.pdf
Project link: https://github.com/alibabaresearch/ChineseBLUE
Summary: Pre-training with whole-entity masking and whole-phrase masking of medical terms.
Citation: Ningyu Zhang, Qianghuai Jia, Kangping Yin, Liang Dong, Feng Gao, Nengwei Hua. Conceptualized Representation Learning for Chinese Biomedical Text Mining. In WSDM '20, February 3–7, 2020, Houston. ACM, New York, NY, USA, 4 pages.
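The whole-entity masking idea can be sketched as follows. This is a minimal illustration, not the authors' code: the tokens, entity spans, and masking probability are all assumed for the example, and real pre-training would also mask non-entity tokens and generate prediction labels.

```python
# Hedged sketch of whole-entity masking: every subword of a selected
# medical entity is masked together, instead of masking subwords
# independently as in vanilla BERT.
import random

MASK = "[MASK]"

def whole_entity_mask(tokens, entity_spans, mask_prob=0.15, rng=None):
    """Mask whole entity spans.

    entity_spans are (start, end) token indices, end exclusive.
    Non-entity tokens are left untouched in this sketch.
    """
    rng = rng or random.Random(0)
    out = list(tokens)
    for start, end in entity_spans:
        if rng.random() < mask_prob:
            for i in range(start, end):
                out[i] = MASK
    return out

tokens = ["the", "patient", "has", "acute", "myeloid", "leukemia", "."]
spans = [(3, 6)]  # "acute myeloid leukemia" treated as one medical entity
masked = whole_entity_mask(tokens, spans, mask_prob=1.0)
# all three entity tokens are replaced together
```

Masking the entity as a unit forces the model to predict the full term from context, rather than recovering one subword from its neighbors inside the same entity.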
Paper title: An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
Paper link: https://arxiv.org/pdf/2005.02799.pdf
Summary: Multi-task learning over downstream tasks (text similarity, relation extraction, inference, NER), with the BERT parameters shared across tasks.
Dataset (BLUE benchmark): https://arxiv.org/pdf/1906.05474.pdf
Project link: https://github.com/ncbi-nlp/BLUE_Benchmark
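The shared-parameter setup can be illustrated with a toy sketch. This is an assumption-laden stand-in: in the paper the shared encoder is BERT and each task has its own output head, while here both are scalar linear functions and the task names are invented for the example.

```python
# Hedged sketch of multi-task learning with a shared encoder:
# one encoder feeds several task heads, so every task's loss
# depends on (and would update) the same shared parameter.

def encoder(x, w_shared):
    # Stand-in for the shared BERT encoder: a single scalar feature.
    return w_shared * sum(x)

def task_head(feature, w_task):
    # Stand-in for a per-task output head.
    return w_task * feature

def multitask_loss(batches, w_shared, head_weights):
    """Sum of squared errors over all tasks; w_shared is common to all."""
    total = 0.0
    for task, (x, y) in batches.items():
        pred = task_head(encoder(x, w_shared), head_weights[task])
        total += (pred - y) ** 2
    return total

# One toy (input, target) pair per task; names are illustrative only.
batches = {"sts": ([1.0, 2.0], 3.0), "ner": ([0.5, 0.5], 1.0)}
loss = multitask_loss(batches, w_shared=1.0,
                      head_weights={"sts": 1.0, "ner": 1.0})
```

Because `w_shared` appears in every task's loss term, gradient updates from each task flow into the shared encoder, which is the mechanism behind sharing BERT parameters across downstream tasks.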
Paper title: Integrating Graph Contextualized Knowledge into Pre-trained Language Models
Paper link: https://arxiv.org/pdf/1912.00147.pdf
Summary: From Huawei and USTC; the focus is on injecting context-dependent knowledge into the pre-trained model. Entity representations are learned from knowledge-graph triples and then integrated into pre-training, so that the knowledge graph's information is fused into the model and guides parameter learning.
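For intuition on learning entity representations from triples, a translation-based scorer such as TransE (named here as an illustrative choice; the paper's exact knowledge-graph embedding method may differ) scores a triple (h, r, t) by the distance ||h + r − t||, with small distances marking plausible triples:

```python
# Hedged sketch of TransE-style triple scoring: an entity or relation
# is a vector, and a triple is plausible when head + relation lands
# near the tail. Embedding values below are toy numbers.
import math

def transe_score(h, r, t):
    """L2 distance between h + r and t; lower = more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy 3-d embeddings; the entity/relation names are illustrative only.
h = [0.2, 0.1, 0.0]   # head entity, e.g. a drug
r = [0.1, 0.3, 0.5]   # relation, e.g. "treats"
t = [0.3, 0.4, 0.5]   # tail entity, e.g. a disease
good = transe_score(h, r, t)               # h + r == t, distance ~0
bad = transe_score(h, r, [1.0, -1.0, 0.0]) # wrong tail, large distance
```

Training pushes `good`-style distances toward zero and `bad`-style distances apart; the resulting entity vectors are what get fused into the pre-trained model.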
Paper title: UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus
Paper link: https://aclanthology.org/2021.naacl-main.139/
Paper title: Self-Alignment Pretraining for Biomedical Entity Representations
Paper link: https://aclanthology.org/2021.naacl-main.334/
Paper title: Are we there yet? Exploring clinical domain knowledge of BERT models
Paper link: https://aclanthology.org/2021.bionlp-1.5/
Paper title: Stress Test Evaluation of Biomedical Word Embeddings
Paper link: https://aclanthology.org/2021.bionlp-1.13/
Paper title: BioELECTRA: Pretrained Biomedical text Encoder using Discriminators
Paper link: https://aclanthology.org/2021.bionlp-1.16/
Paper title: Improving Biomedical Pretrained Language Models with Knowledge
Paper link: https://aclanthology.org/2021.bionlp-1.20/
Paper title: EntityBERT: Entity-centric Masking Strategy for Model Pretraining for the Clinical Domain
Paper link: https://aclanthology.org/2021.bionlp-1.21/
Paper title: ChicHealth @ MEDIQA 2021: Exploring the limits of pre-trained seq2seq models for medical summarization
Paper link: https://aclanthology.org/2021.bionlp-1.29/
Paper title: PubMedBERT: Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
Paper link: https://arxiv.org/pdf/2007.15779.pdf
Summary: Pre-trained from scratch on PubMed text with an in-domain vocabulary, instead of continuing from general-domain BERT weights.
Paper title: SMedBERT: A Knowledge-Enhanced Pre-trained Language Model with Structured Semantics for Medical Text Mining