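It covers eight AI application scenarios: image classification (ResNet), medical image segmentation (U-Net3D), object detection (SSD), object detection (Mask R-CNN), speech recognition (RNN-T), natural language understanding (BERT), recommendation (DLRM), and reinforcement learning (Minigo).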
SUT: system under test
mAP: mean average precision
mIoU: mean intersection over union
FPS: frames per second
FAR: false accept rate
FRR: false reject rate
IR: identification rate
WER: word error rate
SER: sentence error rate
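To make two of the metrics above concrete, here is a minimal sketch of WER and SER. It assumes the common definitions (WER as word-level edit distance divided by the number of reference words, SER as the fraction of sentences containing at least one word error) and whitespace tokenization. The function names are hypothetical and are not taken from any benchmark's scoring code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def sentence_error_rate(references: list[str], hypotheses: list[str]) -> float:
    """SER = fraction of sentences whose word sequence differs from the reference."""
    if not references or len(references) != len(hypotheses):
        raise ValueError("need equally sized, non-empty sentence lists")
    wrong = sum(1 for r, h in zip(references, hypotheses) if r.split() != h.split())
    return wrong / len(references)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words, giving 2/6.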
| Scenario | Query Generation | Duration | Samples/query | Latency Constraint | Tail Latency | Performance Metric |
| --- | --- | --- | --- | --- | --- | --- |
| Single stream | LoadGen sends the next query as soon as the SUT completes the previous query (serial: each query finishes before the next one is issued) | 1,024 queries and 60 seconds | 1 | None | 90% | 90%-ile measured latency |
| Multiple stream | LoadGen sends a new query every latency-constraint interval if the SUT has completed the prior query; otherwise the new query is dropped and counted as one overtime query. Notes: (1) the samples in each query are chosen at random; (2) when a query overruns the latency constraint, the next query is not issued the moment it completes; that interval's query is dropped and the next one waits for the following interval, so the total run takes longer | 270,336 queries and 60 seconds | Variable, see metric | Benchmark specific | 99% | Maximum number of inferences per query supported |
| Server | LoadGen sends new queries to the SUT according to a Poisson distribution | 270,336 queries and 60 seconds | 1 | Benchmark specific | 99% | Maximum Poisson throughput parameter supported |
| Offline | LoadGen sends all queries to the SUT at the start (measures peak processing capacity) | 1 query and 60 seconds | At least 24,576 | None | N/A | Measured throughput |
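The Tail Latency and Server rows can be illustrated with a short sketch. It assumes a nearest-rank percentile and models Poisson arrivals as exponentially distributed gaps between queries. The helper names are hypothetical, and the real LoadGen may compute percentiles and schedule queries differently.

```python
import math
import random


def percentile_latency(latencies: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not latencies:
        raise ValueError("latencies must not be empty")
    ordered = sorted(latencies)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def poisson_arrival_times(rate_qps: float, n_queries: int, seed: int = 0) -> list[float]:
    """Arrival timestamps (seconds) for a Poisson process with the given mean rate."""
    rng = random.Random(seed)
    now = 0.0
    times = []
    for _ in range(n_queries):
        # Gaps between events of a Poisson process are exponentially distributed.
        now += rng.expovariate(rate_qps)
        times.append(now)
    return times


def meets_latency_bound(latencies: list[float], pct: float, bound: float) -> bool:
    """True if the pct-th percentile latency is within the bound."""
    return percentile_latency(latencies, pct) <= bound
```

Under these assumptions, a Single stream result would be `percentile_latency(latencies, 90)`, while a Server run at a given `rate_qps` would pass when `meets_latency_bound(latencies, 99, bound)` holds for the benchmark-specific bound.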