Performance

This chapter reports the performance of the Vitis™ AI Library on the following boards.

  • ZCU102 (0432055-05)
  • ZCU104
  • Alveo U50
  • Alveo U50lv
  • Alveo U280

ZCU102 Performance

The ZCU102 evaluation board uses the mid-range ZU9 UltraScale+ device. There are two hardware versions of the ZCU102 board: one whose serial number begins with 0432055-04 and one whose serial number begins with 0432055-05. The performance of the Vitis AI Library differs between the two versions because their DDR performance differs. Because the 0432055-04 version has been discontinued, the following table shows only the performance of the ZCU102 (0432055-05). On the ZCU102 board, three B4096F DPU cores are implemented in programmable logic.

Refer to the following table for throughput performance (in frames/sec or fps) for various neural network samples on ZCU102 (0432055-05) with DPU running at 281 MHz.

Note: The DPU on the ZCU102 has a hardware softmax acceleration module. Because of limitations in the hardware softmax module, the software softmax is faster when the number of categories reaches 1000. Set XLNX_ENABLE_C_SOFTMAX=1 to enable the software softmax, softmax_c. The default value of XLNX_ENABLE_C_SOFTMAX is 0, which means the softmax method is selected according to the following priorities:
  1. Neon Acceleration
  2. Hardware Softmax
  3. Software Softmax_c

For ZCU102, you can use the following command to test the performance of classification.

env XLNX_ENABLE_C_SOFTMAX=1 ./test_performance_classification resnet50 test_performance_classification.list -t 8 -s 60
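
The command above can also be assembled programmatically, for example to repeat the measurement with several thread counts. The following is a minimal sketch, not part of the Vitis AI Library: the helper name build_perf_command is hypothetical, and it assumes that -t sets the number of threads and -s the run time in seconds, as the example command suggests.

```python
import os
import shlex


def build_perf_command(model, list_file, threads, seconds, use_c_softmax=True):
    """Return (env, argv) for one test_performance_classification run.

    Hypothetical helper: it only mirrors the command line shown in the text.
    """
    env = dict(os.environ)
    # "1" selects the software softmax (softmax_c), "0" keeps the default priority.
    env["XLNX_ENABLE_C_SOFTMAX"] = "1" if use_c_softmax else "0"
    argv = [
        "./test_performance_classification",
        model,
        list_file,
        "-t", str(threads),   # assumed: number of threads
        "-s", str(seconds),   # assumed: run time in seconds
    ]
    return env, argv


if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        env, argv = build_perf_command(
            "resnet50", "test_performance_classification.list", threads, 60
        )
        # Dry run: print the command instead of executing it.
        print("env XLNX_ENABLE_C_SOFTMAX=" + env["XLNX_ENABLE_C_SOFTMAX"],
              shlex.join(argv))
        # On the board, run it with: subprocess.run(argv, env=env, check=True)
```

The sketch only prints the commands, so it can be checked on any host before being run on the target board.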
Table 1. ZCU102 (0432055-05) Performance
No.  Neural Network  Input Size  GOPS  Performance (fps), single thread  Performance (fps), multiple threads
1 inception_resnet_v2_tf 299x299 26.4 23.1 48.7
2 inception_v1_tf 224x224 3.0 184.1 423.9
3 inception_v3_tf 299x299 11.5 57.3 126.7
4 inception_v4_2016_09_09_tf 299x299 24.6 28.5 66.2
5 mobilenet_v1_0_25_128_tf 128x128 0.027 1170.7 4043.5
6 mobilenet_v1_0_5_160_tf 160x160 0.15 707.6 2007.1
7 mobilenet_v1_1_0_224_tf 224x224 1.1 284.3 754.9
8 mobilenet_v2_1_0_224_tf 224x224 0.60 230.8 568.4
9 mobilenet_v2_1_4_224_tf 224x224 1.2 167.3 393.1
10 resnet_v1_101_tf 224x224 14.4 43.1 91.3
11 resnet_v1_152_tf 224x224 21.8 29.6 63.7
12 resnet_v1_50_tf 224x224 7.0 79.1 161.9
13 vgg_16_tf 224x224 31.0 20.1 40.9
14 vgg_19_tf 224x224 39.3 17.3 36.5
15 ssd_mobilenet_v1_coco_tf 300x300 2.5 90.1 332.9
16 ssd_mobilenet_v2_coco_tf 300x300 3.8 63.9 193.2
17 ssd_resnet_50_fpn_coco_tf 640x640 178.4 1.3 5.1
18 yolov3_voc_tf 416x416 65.6 13.5 35
19 mlperf_ssd_resnet34_tf 1200x1200 433 2 7.2
20 resnet50 224x224 7.7 73.5 152.7
21 resnet18 224x224 3.7 186.9 441.6
22 inception_v1 224x224 3.2 178 411.7
23 inception_v2 224x224 4.0 144.4 317.3
24 inception_v3 299x299 11.4 57.5 128.1
25 inception_v4 299x299 24.5 28.5 66.2
26 mobilenet_v2 224x224 0.6 226.8 548
27 squeezenet 227x227 0.76 265.8 1012.3
28 ssd_pedestrain_pruned_0_97 360x360 5.9 76.3 282.6
29 ssd_traffic_pruned_0_9 360x480 11.6 54.5 201.5
30 ssd_adas_pruned_0_95 360x480 6.3 82.9 279.7
31 ssd_mobilenet_v2 360x480 6.6 38.4 114.4
32 refinedet_pruned_0_8 360x480 25 31.7 101.3
33 refinedet_pruned_0_92 360x480 10.1 59.9 196.8
34 refinedet_pruned_0_96 360x480 5.1 82.9 276.2
35 vpgnet_pruned_0_99 480x640 2.5 104.5 381.4
36 fpn 256x512 8.9 59.7 175.5
37 sp_net 128x224 0.55 381.6 1317.4
38 openpose_pruned_0_3 368x368 49.9 3.5 15.1
39 densebox_320_320 320x320 0.49 390 1172.3
40 densebox_640_360 360x640 1.1 200.4 588.7
41 face_landmark 96x72 0.14 849.4 1382.7
42 reid 80x160 0.95 364.2 665.6
43 multi_task 288x512 14.8 35.5 127.7
44 yolov3_adas_pruned_0_9 256x512 5.5 84.1 229.7
45 yolov3_voc 416x416 65.4 13.5 35.3
46 yolov3_bdd 288x512 53.7 13 34.3
47 yolov2_voc 448x448 34 26.8 71
48 yolov2_voc_pruned_0_66 448x448 11.6 63.2 185.9
49 yolov2_voc_pruned_0_71 448x448 9.9 72.8 214.8
50 yolov2_voc_pruned_0_77 448x448 7.8 85.2 258.7
51 facerec_resnet20 112x96 3.5 167.1 320.6
52 facerec_resnet64 112x96 11.0 73 173
53 plate_detection 320x320 0.49 500 1792.2
54 plate_recognition 96x288 1.75 113.4 383.2
55 FPN_Res18_Medical_segmentation 320x320 45.3 12.2 40.3
56 refinedet_baseline 480x360 123 8.3 24.4
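
One way to read Table 1 is to compare the two performance columns for each model. The sketch below computes the multi-thread to single-thread ratio for a few rows copied from the table; the function name thread_scaling and the choice of rows are illustrative only.

```python
# (single-thread fps, multi-thread fps) copied from Table 1, ZCU102 (0432055-05)
ZCU102_FPS = {
    "resnet50": (73.5, 152.7),
    "mobilenet_v2": (226.8, 548.0),
    "squeezenet": (265.8, 1012.3),
    "face_landmark": (849.4, 1382.7),
}


def thread_scaling(single_fps, multi_fps):
    """Ratio of multi-thread throughput to single-thread throughput."""
    return multi_fps / single_fps


for name, (single, multi) in ZCU102_FPS.items():
    print(f"{name:15s} {thread_scaling(single, multi):.2f}x")
```

A ratio above 3 on a board with three DPU cores may indicate that a single thread also leaves the DPU idle during pre- and post-processing; the table alone does not confirm the cause.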

ZCU104 Performance

The ZCU104 evaluation board uses the mid-range ZU7ev UltraScale+ device. Dual B4096F DPU cores are implemented in programmable logic and deliver 2.4 TOPS INT8 peak performance for deep learning inference acceleration.
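
The 2.4 TOPS figure can be reproduced with simple arithmetic. The sketch below assumes that each B4096F core performs at most 4096 operations per clock cycle and that the GOPS column in the ZCU104 table gives the workload per frame; both are assumptions of this example, and the function names are illustrative. It uses the 300 MHz DPU clock and the resnet50 multi-thread result (7.7 GOPS, 148.1 fps) reported for the ZCU104.

```python
def peak_tops(cores, ops_per_cycle, freq_mhz):
    """Theoretical peak in tera-operations per second."""
    return cores * ops_per_cycle * freq_mhz * 1e6 / 1e12


def achieved_tops(gops_per_frame, fps):
    """Throughput actually delivered for one model, in tera-operations per second."""
    return gops_per_frame * fps / 1e3


# Assumption: a B4096F core peaks at 4096 operations per cycle.
zcu104_peak = peak_tops(cores=2, ops_per_cycle=4096, freq_mhz=300)
resnet50 = achieved_tops(gops_per_frame=7.7, fps=148.1)

print(f"peak:        {zcu104_peak:.4f} TOPS")
print(f"resnet50:    {resnet50:.4f} TOPS")
print(f"utilization: {resnet50 / zcu104_peak:.1%}")
```

Under these assumptions the peak comes out at about 2.46 TOPS, consistent with the rounded 2.4 TOPS quoted above, and resnet50 uses a little under half of it.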

Refer to the following table for the throughput performance (in frames/sec or fps) for various neural network samples on ZCU104 with DPU running at 300 MHz.

Table 2. ZCU104 Performance
No.  Neural Network  Input Size  GOPS  Performance (fps), single thread  Performance (fps), multiple threads
1 inception_resnet_v2_tf 299x299 26.4 25 46.2
2 inception_v1_tf 224x224 3.0 197.5 403.1
3 inception_v3_tf 299x299 11.5 60.6 117.4
4 inception_v4_2016_09_09_tf 299x299 24.6 30.3 58.4
5 mobilenet_v1_0_25_128_tf 128x128 0.027 1197.2 3744.2
6 mobilenet_v1_0_5_160_tf 160x160 0.15 737.6 1941.7
7 mobilenet_v1_1_0_224_tf 224x224 1.1 309.1 719.3
8 mobilenet_v2_1_0_224_tf 224x224 0.60 244.7 529.2
9 mobilenet_v2_1_4_224_tf 224x224 1.2 179.9 370.7
10 resnet_v1_101_tf 224x224 14.4 46.1 86.2
11 resnet_v1_152_tf 224x224 21.8 31.6 59.1
12 resnet_v1_50_tf 224x224 7.0 84.6 158.3
13 vgg_16_tf 224x224 31.0 21.3 37
14 vgg_19_tf 224x224 39.3 18.4 32.6
15 ssd_mobilenet_v1_coco_tf 300x300 2.5 93.2 294.6
16 ssd_mobilenet_v2_coco_tf 300x300 3.8 66.3 185.4
17 ssd_resnet_50_fpn_coco_tf 640x640 178.4 1.4 5.2
18 yolov3_voc_tf 416x416 65.6 14.1 29.2
19 mlperf_ssd_resnet34_tf 1200x1200 433 1.7 5.4
20 resnet50 224x224 7.7 78.7 148.1
21 resnet18 224x224 3.7 197.4 411.1
22 inception_v1 224x224 3.2 190.8 389.4
23 inception_v2 224x224 4.0 153.5 302.1
24 inception_v3 299x299 11.4 60.8 117.9
25 inception_v4 299x299 24.5 30.3 58.3
26 mobilenet_v2 224x224 0.6 242.8 520.8
27 squeezenet 227x227 0.76 271 943.6
28 ssd_pedestrain_pruned_0_97 360x360 5.9 78.5 220.5
29 ssd_traffic_pruned_0_9 360x480 11.6 56.2 152.9
30 ssd_adas_pruned_0_95 360x480 6.3 84.9 231.9
31 ssd_mobilenet_v2 360x480 6.6 25.7 101.4
32 refinedet_pruned_0_8 360x480 25 32.6 75.9
33 refinedet_pruned_0_92 360x480 10.1 61.3 154.1
34 refinedet_pruned_0_96 360x480 5.1 83.7 228.4
35 vpgnet_pruned_0_99 480x640 2.5 107.3 354.9
36 fpn 256x512 8.9 61.9 169.4
37 sp_net 128x224 0.55 494.4 1209.8
38 openpose_pruned_0_3 368x368 49.9 3.7 10.9
39 densebox_320_320 320x320 0.49 397.3 1263.9
40 densebox_640_360 360x640 1.1 204.1 621.9
41 face_landmark 96x72 0.14 891.4 1449.5
42 reid 80x160 0.95 387.7 700.3
43 multi_task 288x512 14.8 36 109.1
44 yolov3_adas_pruned_0_9 256x512 5.5 85.2 221.5
45 yolov3_voc 416x416 65.4 14.2 29.5
46 yolov3_bdd 288x512 53.7 13.6 28.6
47 yolov2_voc 448x448 34 28.4 58.7
48 yolov2_voc_pruned_0_66 448x448 11.6 66.6 152.9
49 yolov2_voc_pruned_0_71 448x448 9.9 76.6 179.5
50 yolov2_voc_pruned_0_77 448x448 7.8 89.4 216.1
51 facerec_resnet20 112x96 3.5 177.6 309
52 facerec_resnet64 112x96 11.0 77.7 147.5
53 plate_detection 320x320 0.49 501.8 1761.2
54 plate_recognition 96x288 1.75 225.3 541.2
55 FPN_Res18_Medical_segmentation 320x320 45.3 12.7 31.5
56 refinedet_baseline 480x360 123 8.7 18.2

U50/U50lv Performance

The Xilinx® Alveo U50 Data Center accelerator cards are peripheral component interconnect express (PCIe®) Gen3x16 compliant and Gen4x8 compatible cards featuring the Xilinx 16 nm UltraScale+ technology. In this release, the DPU is implemented in programmable logic for deep learning inference acceleration.

Refer to the following table for the throughput performance (in frames/sec or fps) for various neural network samples on U50 Gen3x4 with DPU running at 6E@300 MHz.
Note: Some models cannot run at the highest DPU frequency and require the DPU frequency to be reduced. See Setting Up the Host for the DPU frequency reduction procedure.
Table 3. U50 Performance with 6E@300 MHz DPU
No.  Neural Network  Input Size  GOPS  DPU Frequency (MHz)  Performance (fps), multiple threads
1 inception_resnet_v2_tf 299x299 26.4 300 173.2
2 inception_v1_tf 224x224 3.0 300 1195.8
3 inception_v3_tf 299x299 11.5 300 398.5
4 inception_v4_2016_09_09_tf 299x299 24.6 300 187.6
5 mobilenet_v1_0_25_128_tf 128x128 0.027 N/A N/A
6 mobilenet_v1_0_5_160_tf 160x160 0.15 N/A N/A
7 mobilenet_v1_1_0_224_tf 224x224 1.1 N/A N/A
8 mobilenet_v2_1_0_224_tf 224x224 0.60 N/A N/A
9 mobilenet_v2_1_4_224_tf 224x224 1.2 N/A N/A
10 resnet_v1_101_tf 224x224 14.4 300 365.1
11 resnet_v1_152_tf 224x224 21.8 300 244.7
12 resnet_v1_50_tf 224x224 7.0 300 703.8
13 vgg_16_tf 224x224 31.0 300 164.7
14 vgg_19_tf 224x224 39.3 300 137
15 ssd_mobilenet_v1_coco_tf 300x300 2.5 N/A N/A
16 ssd_mobilenet_v2_coco_tf 300x300 3.8 N/A N/A
17 ssd_resnet_50_fpn_coco_tf 640x640 178.4 300x0.9 32.7
18 yolov3_voc_tf 416x416 65.6 300x0.9 79.2
19 mlperf_ssd_resnet34_tf 1200x1200 433 N/A N/A
20 resnet50 224x224 7.7 300 631.2
21 resnet18 224x224 3.7 300 1430
22 inception_v1 224x224 3.2 300 1183.3
23 inception_v2 224x224 4.0 300 983.6
24 inception_v3 299x299 11.4 300 405.4
25 inception_v4 299x299 24.5 300 187.7
26 mobilenet_v2 224x224 0.6 N/A N/A
27 squeezenet 227x227 0.76 300 3016.1
28 ssd_pedestrain_pruned_0_97 360x360 5.9 300 621.5
29 ssd_traffic_pruned_0_9 360x480 11.6 300 433
30 ssd_adas_pruned_0_95 360x480 6.3 300 629
31 ssd_mobilenet_v2 360x480 6.6 N/A N/A
32 refinedet_pruned_0_8 360x480 25 300x0.9 193.6
33 refinedet_pruned_0_92 360x480 10.1 300x0.9 420.6
34 refinedet_pruned_0_96 360x480 5.1 300x0.9 617.2
35 vpgnet_pruned_0_99 480x640 2.5 300 478.8
36 fpn 256x512 8.9 300 450.9
37 sp_net 128x224 0.55 300 1158.5
38 openpose_pruned_0_3 368x368 49.9 300x0.9 29.1
39 densebox_320_320 320x320 0.49 300 1929.2
40 densebox_640_360 360x640 1.1 300 877.3
41 face_landmark 96x72 0.14 300 8513.7
42 reid 80x160 0.95 300 3612.9
43 multi_task 288x512 14.8 300 237.3
44 yolov3_adas_pruned_0_9 256x512 5.5 300x0.9 642.9
45 yolov3_voc 416x416 65.4 300x0.9 79
46 yolov3_bdd 288x512 53.7 300x0.9 77.2
47 yolov2_voc 448x448 34 300x0.9 165.6
48 yolov2_voc_pruned_0_66 448x448 11.6 300x0.9 409.6
49 yolov2_voc_pruned_0_71 448x448 9.9 300x0.9 481.5
50 yolov2_voc_pruned_0_77 448x448 7.8 300x0.9 585.4
51 facerec_resnet20 112x96 3.5 300 1278.3
52 facerec_resnet64 112x96 11.0 300 495.7
53 plate_detection 320x320 0.49 300 5135.8
54 plate_recognition 96x288 1.75 N/A N/A
55 FPN_Res18_Medical_segmentation 320x320 45.3 300 103.1
56 refinedet_baseline 480x360 123 300x0.9 50
57 resnet50_pt 224x224 4.1 300 546.4
58 squeezenet_pt 224x224 0.82 300 2024.4
59 inception_v3_pt 299x299 5.7 300 405.3
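
The DPU Frequency column in the table above contains entries such as 300, 300x0.9, and N/A. The sketch below converts such an entry to an effective clock in MHz. It assumes that 300x0.9 means the 300 MHz base clock scaled by a factor of 0.9, which is this example's reading of the notation rather than something the table states; the function name effective_mhz is illustrative.

```python
def effective_mhz(entry):
    """Convert a 'DPU Frequency' cell to MHz, or None when the cell is N/A.

    Assumption: 'BASExSCALE' means the base clock multiplied by SCALE.
    """
    text = entry.strip()
    if text.upper() == "N/A":
        return None
    base, _, scale = text.partition("x")
    return float(base) * (float(scale) if scale else 1.0)


for cell in ("300", "300x0.9", "N/A"):
    print(cell, "->", effective_mhz(cell))
```

The same helper applies to the scaled entries in the other Alveo tables, for example 275x0.8 or 300x0.5.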

The following table shows the throughput performance (in frames/sec or fps) for various neural network samples on U50lv Gen3x4 with DPU running at 9E@275 MHz.

Table 4. U50lv Performance with 9E@275 MHz DPU
No.  Neural Network  Input Size  GOPS  DPU Frequency (MHz)  Performance (fps), multiple threads
1 inception_resnet_v2_tf 299x299 26.4 275 224.1
2 inception_v1_tf 224x224 3.0 275 1607.4
3 inception_v3_tf 299x299 11.5 275 549.7
4 inception_v4_2016_09_09_tf 299x299 24.6 275 256.5
5 mobilenet_v1_0_25_128_tf 128x128 0.027 N/A N/A
6 mobilenet_v1_0_5_160_tf 160x160 0.15 N/A N/A
7 mobilenet_v1_1_0_224_tf 224x224 1.1 N/A N/A
8 mobilenet_v2_1_0_224_tf 224x224 0.60 N/A N/A
9 mobilenet_v2_1_4_224_tf 224x224 1.2 N/A N/A
10 resnet_v1_101_tf 224x224 14.4 275 458.1
11 resnet_v1_152_tf 224x224 21.8 275 305.9
12 resnet_v1_50_tf 224x224 7.0 275 880.6
13 vgg_16_tf 224x224 31.0 275 228.9
14 vgg_19_tf 224x224 39.3 275 189.9
15 ssd_mobilenet_v1_coco_tf 300x300 2.5 N/A N/A
16 ssd_mobilenet_v2_coco_tf 300x300 3.8 N/A N/A
17 ssd_resnet_50_fpn_coco_tf 640x640 178.4 275x0.9 42.6
18 yolov3_voc_tf 416x416 65.6 275x0.9 104
19 mlperf_ssd_resnet34_tf 1200x1200 433 N/A N/A
20 resnet50 224x224 7.7 275 802.5
21 resnet18 224x224 3.7 275 1927.4
22 inception_v1 224x224 3.2 275 1565.3
23 inception_v2 224x224 4.0 275 1289.1
24 inception_v3 299x299 11.4 275 552.4
25 inception_v4 299x299 24.5 275 256.2
26 mobilenet_v2 224x224 0.6 N/A N/A
27 squeezenet 227x227 0.76 275 3767.1
28 ssd_pedestrain_pruned_0_97 360x360 5.9 275 664.2
29 ssd_traffic_pruned_0_9 360x480 11.6 275 483.3
30 ssd_adas_pruned_0_95 360x480 6.3 275 715
31 ssd_mobilenet_v2 360x480 6.6 N/A N/A
32 refinedet_pruned_0_8 360x480 25 275 235.6
33 refinedet_pruned_0_92 360x480 10.1 275 514.7
34 refinedet_pruned_0_96 360x480 5.1 275 725.8
35 vpgnet_pruned_0_99 480x640 2.5 275 595.3
36 fpn 256x512 8.9 275x0.9 530.9
37 sp_net 128x224 0.55 275 2687.7
38 openpose_pruned_0_3 368x368 49.9 275 43.3
39 densebox_320_320 320x320 0.49 275 2431.2
40 densebox_640_360 360x640 1.1 275 1074.4
41 face_landmark 96x72 0.14 275 11759.4
42 reid 80x160 0.95 275 5013.9
43 multi_task 288x512 14.8 275 192.2
44 yolov3_adas_pruned_0_9 256x512 5.5 275x0.9 810
45 yolov3_voc 416x416 65.4 275x0.9 104.2
46 yolov3_bdd 288x512 53.7 275x0.9 103
47 yolov2_voc 448x448 34 275x0.9 227.5
48 yolov2_voc_pruned_0_66 448x448 11.6 275x0.9 565.2
49 yolov2_voc_pruned_0_71 448x448 9.9 275x0.9 662.6
50 yolov2_voc_pruned_0_77 448x448 7.8 275x0.9 807.8
51 facerec_resnet20 112x96 3.5 275 1760.9
52 facerec_resnet64 112x96 11.0 275 663.7
53 plate_detection 320x320 0.49 275 5563.8
54 plate_recognition 96x288 1.75 N/A N/A
55 FPN_Res18_Medical_segmentation 320x320 45.3 275 140.2
56 refinedet_baseline 480x360 123 275 70.5
57 resnet50_pt 224x224 4.1 275 768.1
58 squeezenet_pt 224x224 0.82 275 2540.6
59 inception_v3_pt 299x299 5.7 275 551.5

The following table shows the throughput performance (in frames/sec or fps) for various neural network samples on U50lv Gen3x4 with DPU running at 10E@275 MHz.

Table 5. U50lv Performance with 10E@275 MHz DPU
No.  Neural Network  Input Size  GOPS  DPU Frequency (MHz)  Performance (fps), multiple threads
1 inception_resnet_v2_tf 299x299 26.4 N/A N/A
2 inception_v1_tf 224x224 3.0 275x0.9 1552.5
3 inception_v3_tf 299x299 11.5 N/A N/A
4 inception_v4_2016_09_09_tf 299x299 24.6 N/A N/A
5 mobilenet_v1_0_25_128_tf 128x128 0.027 N/A N/A
6 mobilenet_v1_0_5_160_tf 160x160 0.15 N/A N/A
7 mobilenet_v1_1_0_224_tf 224x224 1.1 N/A N/A
8 mobilenet_v2_1_0_224_tf 224x224 0.60 N/A N/A
9 mobilenet_v2_1_4_224_tf 224x224 1.2 N/A N/A
10 resnet_v1_101_tf 224x224 14.4 275x0.9 458.9
11 resnet_v1_152_tf 224x224 21.8 275x0.9 306.5
12 resnet_v1_50_tf 224x224 7.0 275x0.9 882.93
13 vgg_16_tf 224x224 31.0 275x0.9 229.3
14 vgg_19_tf 224x224 39.3 275x0.9 189.9
15 ssd_mobilenet_v1_coco_tf 300x300 2.5 N/A N/A
16 ssd_mobilenet_v2_coco_tf 300x300 3.8 N/A N/A
17 ssd_resnet_50_fpn_coco_tf 640x640 178.4 275x0.8 41.8
18 yolov3_voc_tf 416x416 65.6 275x0.8 102.4
19 mlperf_ssd_resnet34_tf 1200x1200 433 N/A N/A
20 resnet50 224x224 7.7 275x0.9 802.5
21 resnet18 224x224 3.7 275x0.9 1934.5
22 inception_v1 224x224 3.2 275x0.9 1536.6
23 inception_v2 224x224 4.0 275x0.9 1314
24 inception_v3 299x299 11.4 N/A N/A
25 inception_v4 299x299 24.5 N/A N/A
26 mobilenet_v2 224x224 0.6 N/A N/A
27 squeezenet 227x227 0.76 275x0.9 3451.1
28 ssd_pedestrain_pruned_0_97 360x360 5.9 275x0.9 755.2
29 ssd_traffic_pruned_0_9 360x480 11.6 275x0.9 570.8
30 ssd_adas_pruned_0_95 360x480 6.3 275x0.9 818.2
31 ssd_mobilenet_v2 360x480 6.6 N/A N/A
32 refinedet_pruned_0_8 360x480 25 275x0.9 273.8
33 refinedet_pruned_0_92 360x480 10.1 275x0.9 574.8
34 refinedet_pruned_0_96 360x480 5.1 275x0.9 795.1
35 vpgnet_pruned_0_99 480x640 2.5 275 659
36 fpn 256x512 8.9 275x0.9 552.2
37 sp_net 128x224 0.55 275 1707
38 openpose_pruned_0_3 368x368 49.9 275x0.8 39.7
39 densebox_320_320 320x320 0.49 275 2572.7
40 densebox_640_360 360x640 1.1 275 1125.1
41 face_landmark 96x72 0.14 275 12917.2
42 reid 80x160 0.95 275 5548.1
43 multi_task 288x512 14.8 275x0.9 177
44 yolov3_adas_pruned_0_9 256x512 5.5 275x0.8 771.3
45 yolov3_voc 416x416 65.4 275x0.8 102.2
46 yolov3_bdd 288x512 53.7 275x0.8 100.6
47 yolov2_voc 448x448 34 275x0.8 223.3
48 yolov2_voc_pruned_0_66 448x448 11.6 275x0.8 547.6
49 yolov2_voc_pruned_0_71 448x448 9.9 275x0.8 639.1
50 yolov2_voc_pruned_0_77 448x448 7.8 275x0.8 770.9
51 facerec_resnet20 112x96 3.5 275 1943.4
52 facerec_resnet64 112x96 11.0 275 736.4
53 plate_detection 320x320 0.49 275 5521.4
54 plate_recognition 96x288 1.75 N/A N/A
55 FPN_Res18_Medical_segmentation 320x320 45.3 275x0.9 139.8
56 refinedet_baseline 480x360 123 N/A N/A
57 resnet50_pt 224x224 4.1 275 764.6
58 squeezenet_pt 224x224 0.82 275x0.9 2393.2
59 inception_v3_pt 299x299 5.7 N/A N/A

U280 Performance

The Xilinx® Alveo U280 Data Center accelerator cards are peripheral component interconnect express (PCIe®) Gen3x16 compliant and Gen4x8 compatible cards featuring the Xilinx 16 nm UltraScale+ technology. In this release, the DPU is implemented in programmable logic for deep learning inference acceleration.

Refer to the following table for the throughput performance (in frames/sec or fps) for various neural network samples on U280 Gen3x16 with DPU running at 14E@300 MHz.
Note: Some models cannot run at the highest DPU frequency and require the DPU frequency to be reduced. See Setting Up the Host for the DPU frequency reduction procedure.
Table 6. U280 Performance with 14E@300 MHz DPU
No.  Neural Network  Input Size  GOPS  DPU Frequency (MHz)  Performance (fps), multiple threads
1 inception_resnet_v2_tf 299x299 26.4 300x0.5 150.1
2 inception_v1_tf 224x224 3.0 300x0.5 1117.9
3 inception_v3_tf 299x299 11.5 300x0.5 371.8
4 inception_v4_2016_09_09_tf 299x299 24.6 300x0.5 168
5 mobilenet_v1_0_25_128_tf 128x128 0.027 N/A N/A
6 mobilenet_v1_0_5_160_tf 160x160 0.15 N/A N/A
7 mobilenet_v1_1_0_224_tf 224x224 1.1 N/A N/A
8 mobilenet_v2_1_0_224_tf 224x224 0.60 N/A N/A
9 mobilenet_v2_1_4_224_tf 224x224 1.2 N/A N/A
10 resnet_v1_101_tf 224x224 14.4 300x0.5 387.5
11 resnet_v1_152_tf 224x224 21.8 300x0.5 258.9
12 resnet_v1_50_tf 224x224 7.0 300x0.6 890.3
13 vgg_16_tf 224x224 31.0 300x0.5 182.7
14 vgg_19_tf 224x224 39.3 300x0.5 153.1
15 ssd_mobilenet_v1_coco_tf 300x300 2.5 N/A N/A
16 ssd_mobilenet_v2_coco_tf 300x300 3.8 N/A N/A
17 ssd_resnet_50_fpn_coco_tf 640x640 178.4 300x0.5 28.8
18 yolov3_voc_tf 416x416 65.6 300x0.6 112.4
19 mlperf_ssd_resnet34_tf 1200x1200 433 N/A N/A
20 resnet50 224x224 7.7 300x0.7 918.1
21 resnet18 224x224 3.7 300x0.5 1634.4
22 inception_v1 224x224 3.2 300x0.5 1069.5
23 inception_v2 224x224 4.0 300x0.5 937
24 inception_v3 299x299 11.4 300x0.5 372
25 inception_v4 299x299 24.5 300x0.5 167
26 mobilenet_v2 224x224 0.6 N/A N/A
27 squeezenet 227x227 0.76 300x0.5 2821.7
28 ssd_pedestrain_pruned_0_97 360x360 5.9 300x0.5 423.3
29 ssd_traffic_pruned_0_9 360x480 11.6 300x0.5 306
30 ssd_adas_pruned_0_95 360x480 6.3 300x0.5 476.1
31 ssd_mobilenet_v2 360x480 6.6 N/A N/A
32 refinedet_pruned_0_8 360x480 25 N/A N/A
33 refinedet_pruned_0_92 360x480 10.1 N/A N/A
34 refinedet_pruned_0_96 360x480 5.1 N/A N/A
35 vpgnet_pruned_0_99 480x640 2.5 300x0.5 567.8
36 fpn 256x512 8.9 300x0.5 362.9
37 sp_net 128x224 0.55 300x0.5 2126.6
38 openpose_pruned_0_3 368x368 49.9 300x0.5 36.5
39 densebox_320_320 320x320 0.49 300x0.5 2622.3
40 densebox_640_360 360x640 1.1 300x0.5 1138.8
41 face_landmark 96x72 0.14 300x0.5 11302.4
42 reid 80x160 0.95 300x0.5 4608
43 multi_task 288x512 14.8 300x0.5 128.3
44 yolov3_adas_pruned_0_9 256x512 5.5 300x0.6 893.1
45 yolov3_voc 416x416 65.4 300x0.6 113.6
46 yolov3_bdd 288x512 53.7 300x0.6 108.6
47 yolov2_voc 448x448 34 N/A N/A
48 yolov2_voc_pruned_0_66 448x448 11.6 300x0.5 490.3
49 yolov2_voc_pruned_0_71 448x448 9.9 300x0.5 570.3
50 yolov2_voc_pruned_0_77 448x448 7.8 300x0.5 679.6
51 facerec_resnet20 112x96 3.5 300x0.5 1576.9
52 facerec_resnet64 112x96 11.0 300x0.5 575.4
53 plate_detection 320x320 0.49 300x0.5 4235.7
54 plate_recognition 96x288 1.75 N/A N/A
55 FPN_Res18_Medical_segmentation 320x320 45.3 300x0.5 104.9
56 refinedet_baseline 480x360 123 N/A N/A
57 resnet50_pt 224x224 4.1 300x0.7 878.4
58 squeezenet_pt 224x224 0.82 300x0.5 1655.7
59 inception_v3_pt 299x299 5.7 300x0.5 371