Machine Learning in Medicine
Data Science
Personalized, or precision, medicine has long been a touchstone for what the future of treatment could be. It consists of using knowledge specific to a patient, such as biomarkers, demographics, or lifestyle characteristics, to best treat their ailment, rather than generic, averaged best practices. In the best case, it could leverage the expertise of thousands of practitioners and outcomes from millions of other patients to provide proven, effective care.
Precision medicine has a number of benefits, both for the patient and physician (adapted from Fröhlich):
- Improved treatment efficacy
- Reduced adverse effects
- Lower costs for patients and providers
- Earlier diagnosis using biomarkers
- Improved prognosis estimation
The amount of raw data needed to make individualized treatment achievable necessitates machine learning. A previous article of mine covers this, along with other applications of machine learning in oncology specifically, since oncology is likely the field closest to realizing results with precision medicine.
Pharmacogenomics is an entire field dedicated to studying how a patient’s genes affect their response to drugs. This is mostly used to limit adverse drug effects due to prolonged exposure or increased sensitivity, but some drugs require specific genes to be expressed for their target to be available. For example, Herceptin (trastuzumab) targets HER2/neu receptors, and consequently that gene must be over-expressed in the patient’s cancer for Herceptin to be prescribed.
One successful model used genetic data from 800 patients over 10 years at the Mayo Clinic to determine the efficacy of various drugs for easing depressive symptoms. The model was able to achieve 85% accuracy compared to about 55% for psychiatrists, who often have to use trial-and-error with patients to find the most effective drug.
Other potential avenues for improved care include identifying causal genes, understanding phenotypic and genotypic differences among patients, gene-gene interactions, and novel drug discovery. DeepSEA, a deep learning model developed by Princeton University, has had success in predicting chromatin effects of single nucleotide aberrations in DNA. Models such as these have great potential in their predictive success, but they should also illuminate mechanistic relationships between genotypes, disease diagnosis, treatment options, etc.
Challenges
While these various approaches to precision medicine are exciting and are actively being explored, there are still many challenges to overcome. You should read this article for a more in-depth examination, but some points are summarized below.
First, we often hear “Ah, if we only had more data then our model would be better”, but this does not capture the reality, especially in medicine, that data quality is often passed over in favor of data quantity. Colloquially, garbage in, garbage out. While a lot of data is often necessary, that data must have underlying patterns that can separate noise from signal. As Fröhlich notes, some noise is error from sampling (not useful) and some is biological variation (very useful), but neither we nor our models can differentiate between them.
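The quality-versus-quantity point can be illustrated with a small simulation. This is only a sketch on made-up data: the cohort generator, the single biomarker, the noise levels, and the midpoint-threshold "model" are all assumptions chosen for illustration, not anything taken from Fröhlich or from a real clinical dataset. Here the simulation labels which variation is biological and which is measurement error, something we cannot do with real data.

```python
import random
from statistics import mean

def simulate(n, measurement_sd, seed):
    """Hypothetical cohort: one biomarker per patient plus a binary response label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Biological variation around a class-dependent mean (informative).
        true_level = rng.gauss(2.0 * label, 1.0)
        # Measurement (sampling) error added on top (uninformative).
        observed = true_level + rng.gauss(0.0, measurement_sd)
        rows.append((observed, label))
    return rows

def fit_threshold(rows):
    """Toy model: the midpoint between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (mean(pos) + mean(neg)) / 2.0

def accuracy(threshold, rows):
    return mean(int((x > threshold) == (y == 1)) for x, y in rows)

def experiment(n_train, measurement_sd, seed=0):
    train = simulate(n_train, measurement_sd, seed)
    test = simulate(2000, measurement_sd, seed + 1)  # measured the same way
    return accuracy(fit_threshold(train), test)

small_clean = experiment(n_train=200, measurement_sd=0.0)
large_noisy = experiment(n_train=20000, measurement_sd=3.0)
print(f"200 cleanly measured patients:   {small_clean:.2f}")
print(f"20000 noisily measured patients: {large_noisy:.2f}")
```

In this toy setting, a hundred times more data does not recover the accuracy lost to measurement error, because the error hides the signal in every individual record rather than in the estimate of the threshold.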
Second, while our models can discover novel patterns in data and even provide very accurate predictions on new examples, they cannot prove causal relationships. It is this specific quality that makes them amazingly useful tools but not a real substitute for the scientific method. Modern media loves to describe “AI” as a replacement for any and every job or task when it simply cannot be. See the less-than-ideal implementation of IBM’s Watson.
Third, a predictive model lives by its performance on unseen data. This is the rationale behind train/test splits, cross-validation, data augmentation, etc. If a model does not generalize well, then it is not very useful. What makes deploying precision medicine models difficult is that the consequences of poor performance can be severe. We must be aware of the ethical pitfalls of validating our models on the health of unaware patients. Therefore, expensive clinical trials are still needed to substantiate the model and show how it improves care, if at all.
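As a rough illustration of what "performance on unseen data" means in practice, here is a minimal k-fold cross-validation sketch using only the Python standard library. The synthetic patients, the single biomarker, and the threshold classifier are hypothetical stand-ins; a real study would use a proper modeling library and, as noted above, would still need clinical validation.

```python
import random
from statistics import mean

def make_synthetic_patients(n, seed=0):
    """Hypothetical data: (biomarker level, responded to treatment) pairs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        responder = rng.randint(0, 1)
        # Assumption for illustration: responders have a higher mean biomarker level.
        biomarker = rng.gauss(2.0 * responder, 1.0)
        data.append((biomarker, responder))
    return data

def fit_threshold(train):
    """Toy model: the midpoint between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (mean(pos) + mean(neg)) / 2.0

def accuracy(threshold, rows):
    return mean(int((x > threshold) == (y == 1)) for x, y in rows)

def k_fold_accuracy(data, k=5, seed=0):
    """Score the model k times, each time on a fold it never saw during fitting."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(fit_threshold(train), held_out))
    return scores

patients = make_synthetic_patients(500)
scores = k_fold_accuracy(patients, k=5)
print("per-fold accuracy:", [round(s, 2) for s in scores])
print(f"mean held-out accuracy: {mean(scores):.2f}")
```

The spread across folds gives a crude sense of how much the estimate depends on which patients happened to be held out, which is one reason a single train/test split can be misleading on small cohorts.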
The Future
Precision medicine has an exciting future for improving care according to a patient’s specific phenotypes and genome. Specific models are achieving success in drug discovery, treatment efficacy, guiding diagnoses, and much more. IBM’s Watson general AI has even been incorporated into a small number of oncology departments around the world, though with mixed results. Indeed, the challenges facing machine learning for precision medicine are varied and obstinate. But hey, creating the future is never easy.
Sources
[1] Forbes Insights Team, How Machine Learning Is Crafting Precision Medicine (2019), Forbes Insights.
[2] G.Z. Papadakis, A.H. Karantanas, M. Tsiknakis, et al., Deep learning opens new horizons in personalized medicine (2019), Biomedical Reports 10 (4): 215–217.
[3] M. Uddin, Y. Wang and M. Woodbury-Smith, Artificial intelligence for precision medicine in neurodevelopmental disorders (2019), npj Digital Medicine 2: 112.
[4] J. Zhou and O.G. Troyanskaya, Predicting effects of noncoding variants with deep learning-based sequence model (2015), Nature Methods 12: 931–934.
Translated from: https://towardsdatascience.com/precision-medicine-and-machine-learning-11060caa3065