Paper: https://arxiv.org/abs/2310.19070
Code: https://github.com/tzjtatata/Myriad
Myriad is an ambitious choice of name.
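Goal:
Existing industrial anomaly detection (IAD) methods predict anomaly scores for anomaly detection and localization. However, they struggle to support multi-turn dialogue or to describe anomalous regions in detail, for example the color, shape, and category of an industrial anomaly. #(replace with the coil's xxx)
Recently, large multimodal (i.e., vision and language) models (LMMs) have shown strong perception abilities on many vision tasks such as image captioning, visual understanding, and visual reasoning, which makes them a competitive candidate for more interpretable anomaly detection. However, existing general-purpose LMMs lack knowledge about anomaly detection, and training a dedicated LMM for anomaly detection requires large amounts of annotated data and substantial compute resources. (Compute resources?)
The paper proposes a novel large multimodal model that applies vision experts to industrial anomaly detection (called Myriad), achieving explicit anomaly detection and high-quality anomaly descriptions.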
Contributions:
① We propose a novel large vision-language model that applies vision experts to industrial anomaly detection, termed Myriad. By incorporating arbitrary existing IAD models as vision experts that provide prior knowledge, Myriad produces comprehensive descriptions for the IAD task.
② We design a Vision Expert Tokenizer so that LMMs can receive this prior knowledge. We further propose an Expert-Driven Vision-Language Extraction module that extracts domain-specific vision-language representations to achieve more accurate anomaly detection.
③ Extensive experiments show that Myriad can receive prior knowledge from vision experts and outperforms state-of-the-art methods on both the MVTec-AD [1] and VisA [39] benchmarks.
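To make contribution ② a little more concrete, here is a rough sketch of what "tokenizing" a vision expert's output could look like. It is not the paper's implementation: the function name `tokenize_anomaly_map`, the patch size, the token dimension, and the fixed random projection (a stand-in for learned weights) are all assumptions made for illustration. The only idea taken from the text above is that an existing IAD model's output is turned into tokens an LMM can consume.

```python
import random


def tokenize_anomaly_map(anomaly_map, patch, dim, seed=0):
    """Turn an H x W anomaly map into a list of `dim`-dimensional tokens.

    Hypothetical illustration only: each non-overlapping patch x patch block
    is flattened and passed through a fixed random linear projection.
    """
    h, w = len(anomaly_map), len(anomaly_map[0])
    if h % patch or w % patch:
        raise ValueError("map size must be divisible by the patch size")

    rng = random.Random(seed)
    # Stand-in for learned projection weights, shape (patch * patch) x dim.
    weights = [
        [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(patch * patch)
    ]

    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            # Flatten one patch of anomaly scores in row-major order.
            flat = [
                anomaly_map[top + i][left + j]
                for i in range(patch)
                for j in range(patch)
            ]
            # Linear projection: token[d] = sum_k flat[k] * weights[k][d].
            token = [
                sum(v * weights[k][d] for k, v in enumerate(flat))
                for d in range(dim)
            ]
            tokens.append(token)
    return tokens


# Toy 4 x 4 anomaly map from an assumed vision expert:
# only the bottom-right region has high anomaly scores.
toy_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.8],
    [0.0, 0.0, 0.7, 0.9],
]
tokens = tokenize_anomaly_map(toy_map, patch=2, dim=3)
print(len(tokens), len(tokens[0]))
```

With a 2 x 2 patch size the toy map yields four tokens, one per patch. The three patches with no anomaly signal map to zero vectors, and only the bottom-right token carries information. In a real system the projection would be trained jointly with the language model rather than fixed at random.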