To make an informed decision about the offset type solution, we need a complete understanding of its impact on each CUB algorithm. For each algorithm, we'd like a table that summarizes the following:
Performance with i32, u32, i64, and u64 offset types
Verification that the algorithm still passes its unit tests with each offset type
A box plot of the performance impact across a range of data types, input sizes, etc.
The box plot can be summarized in a table as min/max/median
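As a minimal sketch of the min/max/median summary, assuming each ratio column is the offset type's time divided by the i32 time, the following Python computes the summary for one column. The helper names `ratio_percent` and `summarize` are hypothetical, and the sample values are the i32 and u32 times of the I8 rows from the device_exclusive_scan_max results.

```python
from statistics import median


def ratio_percent(time, baseline_time):
    """Return `time` as a percentage of `baseline_time` (100.0 means equal)."""
    return 100.0 * time / baseline_time


def summarize(ratios):
    """Return (min, median, max) of a list of ratio percentages."""
    return min(ratios), median(ratios), max(ratios)


# (i32 time, u32 time) pairs taken from the I8 rows of device_exclusive_scan_max.
samples = [
    (8.91, 9.367),
    (13.494, 13.368),
    (53.134, 52.893),
    (729.843, 729.989),
]

ratios = [ratio_percent(u32, i32) for i32, u32 in samples]
lo, med, hi = summarize(ratios)
print(f"u32/i32 time: min={lo:.2f}% median={med:.2f}% max={hi:.2f}%")
```

The same two helpers could be applied per algorithm and per offset type to produce one summary row each; grouping by data type or input size is left out here to keep the sketch short.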
Algorithms
device_exclusive_scan_max
device_exclusive_scan_sum
device_exclusive_scan_by_key
device_select_flagged
device_select_if
device_partition_if
device_partition_flagged
device_run_length_encode
device_run_length_encode_non_trivial_runs
device_segmented_sort
System & environment
CTK 12.5
Ubuntu 22.04
H100
Benchmark data
device_exclusive_scan_max
Results
| T{ct} | Elements{io} | i32 | u32 | u32/i32 time | i64 | i64/i32 time | u64 | u64/i32 time |
|---|---|---|---|---|---|---|---|---|
| I8 | 2^16 = 65536 | 8.91 | 9.367 | 105.13% | 9.232 | 103.61% | 9.359 | 105.04% |
| I8 | 2^20 = 1048576 | 13.494 | 13.368 | 99.07% | 13.124 | 97.26% | 13.187 | 97.72% |
| I8 | 2^24 = 16777216 | 53.134 | 52.893 | 99.55% | 53.066 | 99.87% | 53.063 | 99.87% |
| I8 | 2^28 = 268435456 | 729.843 | 729.989 | 100.02% | 731.428 | 100.22% | 731.704 | 100.25% |
| I16 | 2^16 = 65536 | 9.752 | 9.715 | 99.62% | 9.792 | 100.41% | 9.631 | 98.76% |
| I16 | 2^20 = 1048576 | 13.951 | 14.187 | 101.69% | 14.069 | 100.85% | 14.206 | 101.83% |
| I16 | 2^24 = 16777216 | 67.863 | 67.839 | 99.96% | 67.01 | 98.74% | 66.97 | 98.68% |
| I16 | 2^28 = 268435456 | 952.537 | 951.78 | 99.92% | 978.596 | 102.74% | 977.729 | 102.64% |
| I32 | 2^16 = 65536 | 9.769 | 9.261 | 94.80% | 9.289 | 95.09% | 9.176 | 93.93% |
| I32 | 2^20 = 1048576 | 15.022 | 14.511 | 96.60% | 14.512 | 96.60% | 14.564 | 96.95% |
| I32 | 2^24 = 16777216 | 99.151 | 98.738 | 99.58% | 98.688 | 99.53% | 98.659 | 99.50% |
| I32 | 2^28 = 268435456 | 1456 | 1456 | 100.00% | 1455 | 99.93% | 1455 | 99.93% |
| I64 | 2^16 = 65536 | 9.573 | 9.634 | 100.64% | 9.478 | 99.01% | 9.584 | 100.11% |
| I64 | 2^20 = 1048576 | 19.037 | 18.873 | 99.14% | 19.032 | 99.97% | 18.964 | 99.62% |
| I64 | 2^24 = 16777216 | 171.205 | 171.247 | 100.02% | 171.164 | 99.98% | 171.331 | 100.07% |
| I64 | 2^28 = 268435456 | 2631 | 2632 | 100.04% | 2631 | 100.00% | 2630 | 99.96% |
| I128 | 2^16 = 65536 | 16.374 | 16.465 | 100.56% | 16.523 | 100.91% | 16.542 | 101.03% |
| I128 | 2^20 = 1048576 | 65.597 | 65.4 | 99.70% | 65.036 | 99.14% | 65.177 | 99.36% |
| I128 | 2^24 = 16777216 | 868.064 | 868.039 | 100.00% | 866.639 | 99.84% | 867.766 | 99.97% |
| I128 | 2^28 = 268435456 | 13727 | 13724 | 99.98% | 13715 | 99.91% | 13716 | 99.92% |
| F32 | 2^16 = 65536 | 9.441 | 9.475 | 100.36% | 9.38 | 99.35% | 9.433 | 99.92% |
| F32 | 2^20 = 1048576 | 14.47 | 14.477 | 100.05% | 14.584 | 100.79% | 14.568 | 100.68% |
| F32 | 2^24 = 16777216 | 98.968 | 98.861 | 99.89% | 98.991 | 100.02% | 99.018 | 100.05% |
| F32 | 2^28 = 268435456 | 1459 | 1459 | 100.00% | 1459 | 100.00% | 1459 | 100.00% |
| F64 | 2^16 = 65536 | 9.767 | 9.579 | 98.08% | 9.485 | 97.11% | 9.711 | 99.43% |
| F64 | 2^20 = 1048576 | 18.928 | 18.857 | 99.62% | 19.034 | 100.56% | 18.935 | 100.04% |
| F64 | 2^24 = 16777216 | 173.845 | 173.919 | 100.04% | 173.85 | 100.00% | 173.954 | 100.06% |
| F64 | 2^28 = 268435456 | 2695 | 2695 | 100.00% | 2695 | 100.00% | 2694 | 99.96% |
| C64 | 2^16 = 65536 | 67.742 | 67.006 | 98.91% | 66.853 | 98.69% | 67.309 | 99.36% |
| C64 | 2^20 = 1048576 | 114.314 | 114.531 | 100.19% | 115.019 | 100.62% | 115.035 | 100.63% |
| C64 | 2^24 = 16777216 | 1165 | 1162 | 99.74% | 1155 | 99.14% | 1150 | 98.71% |
| C64 | 2^28 = 268435456 | 17419 | 17458 | 100.22% | 17380 | 99.78% | 17475 | 100.32% |
device_exclusive_scan_sum
Results
| T{ct} | Elements{io} | i32 | u32 | u32/i32 time | i64 | i64/i32 time | u64 | u64/i32 time |
|---|---|---|---|---|---|---|---|---|
| I8 | 2^16 = 65536 | 8.761 | 9.183 | 104.82% | 9.104 | 103.92% | 9.29 | 106.04% |
| I8 | 2^20 = 1048576 | 12.446 | 12.639 | 101.55% | 12.35 | 99.23% | 12.103 | 97.24% |
| I8 | 2^24 = 16777216 | 49.837 | 50.075 | 100.48% | 52.084 | 104.51% | 52.143 | 104.63% |
| I8 | 2^28 = 268435456 | 649.015 | 650.869 | 100.29% | 685.588 | 105.64% | 686.283 | 105.74% |
| I16 | 2^16 = 65536 | 9.739 | 9.557 | 98.13% | 9.748 | 100.09% | 9.58 | 98.37% |
| I16 | 2^20 = 1048576 | 13.246 | 13.363 | 100.88% | 13.202 | 99.67% | 13.321 | 100.57% |
| I16 | 2^24 = 16777216 | 60.795 | 60.863 | 100.11% | 60.789 | 99.99% | 60.878 | 100.14% |
| I16 | 2^28 = 268435456 | 799.135 | 798.496 | 99.92% | 799.297 | 100.02% | 799.511 | 100.05% |
| I32 | 2^16 = 65536 | 9.598 | 9.212 | 95.98% | 9.342 | 97.33% | 9.176 | 95.60% |
| I32 | 2^20 = 1048576 | 14.984 | 14.715 | 98.20% | 14.78 | 98.64% | 14.795 | 98.74% |
| I32 | 2^24 = 16777216 | 90.713 | 90.552 | 99.82% | 90.508 | 99.77% | 90.405 | 99.66% |
| I32 | 2^28 = 268435456 | 1366 | 1366 | 100.00% | 1367 | 100.07% | 1366 | 100.00% |
| I64 | 2^16 = 65536 | 9.714 | 9.792 | 100.80% | 9.626 | 99.09% | 9.938 | 102.31% |
| I64 | 2^20 = 1048576 | 17.97 | 17.932 | 99.79% | 18.348 | 102.10% | 18.371 | 102.23% |
| I64 | 2^24 = 16777216 | 160.284 | 160.186 | 99.94% | 160.507 | 100.14% | 160.582 | 100.19% |
| I64 | 2^28 = 268435456 | 2476 | 2475 | 99.96% | 2476 | 100.00% | 2476 | 100.00% |
| I128 | 2^16 = 65536 | 14.439 | 14.311 | 99.11% | 14.591 | 101.05% | 14.559 | 100.83% |
| I128 | 2^20 = 1048576 | 38.383 | 38.23 | 99.60% | 38.186 | 99.49% | 38.178 | 99.47% |
| I128 | 2^24 = 16777216 | 388.32 | 388.111 | 99.95% | 388.071 | 99.94% | 387.488 | 99.79% |
| I128 | 2^28 = 268435456 | 6036 | 6034 | 99.97% | 6027 | 99.85% | 6026 | 99.83% |
| F32 | 2^16 = 65536 | 9.63 | 9.58 | 99.48% | 9.682 | 100.54% | 9.694 | 100.66% |
| F32 | 2^20 = 1048576 | 14.878 | 14.867 | 99.93% | 14.885 | 100.05% | 14.976 | 100.66% |
| F32 | 2^24 = 16777216 | 90.628 | 90.576 | 99.94% | 90.608 | 99.98% | 90.615 | 99.99% |
| F32 | 2^28 = 268435456 | 1364 | 1364 | 100.00% | 1363 | 99.93% | 1363 | 99.93% |
| F64 | 2^16 = 65536 | 10.154 | 9.662 | 95.15% | 9.764 | 96.16% | 9.629 | 94.83% |
| F64 | 2^20 = 1048576 | 18.582 | 18.327 | 98.63% | 18.171 | 97.79% | 18.377 | 98.90% |
| F64 | 2^24 = 16777216 | 161.011 | 160.632 | 99.76% | 160.657 | 99.78% | 160.797 | 99.87% |
| F64 | 2^28 = 268435456 | 2479 | 2480 | 100.04% | 2479 | 100.00% | 2479 | 100.00% |
| C64 | 2^16 = 65536 | 12.738 | 12.733 | 99.96% | 12.641 | 99.24% | 12.891 | 101.20% |
| C64 | 2^20 = 1048576 | 29.051 | 28.972 | 99.73% | 29.436 | 101.33% | 29.412 | 101.24% |
| C64 | 2^24 = 16777216 | 291.342 | 291.891 | 100.19% | 291.113 | 99.92% | 291.2 | 99.95% |
| C64 | 2^28 = 268435456 | 4543 | 4550 | 100.15% | 4530 | 99.71% | 4530 | 99.71% |
device_exclusive_scan_by_key
Results
| KeyT{ct} | ValueT{ct} | Elements{io} | i32 | u32 | u32/i32 time | i64 | i64/i32 time | u64 | u64/i32 time |
|---|---|---|---|---|---|---|---|---|---|
| I8 | I8 | 2^16 = 65536 | 9.966 | 9.955 | 99.89% | 10.134 | 101.69% | 10.314 | 103.49% |
| I8 | I8 | 2^20 = 1048576 | 14.122 | 14.316 | 101.37% | 14.502 | 102.69% | 14.51 | 102.75% |
| I8 | I8 | 2^24 = 16777216 | 69.589 | 70.034 | 100.64% | 75.626 | 108.68% | 75.465 | 108.44% |
| I8 | I8 | 2^28 = 268435456 | 943.499 | 947.544 | 100.43% | 1050 | 111.29% | 1047 | 110.97% |
| I8 | I16 | 2^16 = 65536 | 11.669 | 11.72 | 100.44% | 12.03 | 103.09% | 12.053 | 103.29% |
| I8 | I16 | 2^20 = 1048576 | 15.11 | 15.129 | 100.13% | 15.401 | 101.93% | 15.408 | 101.97% |
| I8 | I16 | 2^24 = 16777216 | 87.262 | 80.269 | 91.99% | 89.469 | 102.53% | 89.615 | 102.70% |
| I8 | I16 | 2^28 = 268435456 | 1176 | 1050 | 89.29% | 1210 | 102.89% | 1213 | 103.15% |
| I8 | I32 | 2^16 = 65536 | 11.071 | 10.667 | 96.35% | 11.084 | 100.12% | 10.807 | 97.62% |
| I8 | I32 | 2^20 = 1048576 | 17.406 | 17.274 | 99.24% | 17.431 | 100.14% | 17.272 | 99.23% |
| I8 | I32 | 2^24 = 16777216 | 102.249 | 102.099 | 99.85% | 111.93 | 109.47% | 105.019 | 102.71% |
| I8 | I32 | 2^28 = 268435456 | 1482 | 1482 | 100.00% | 1594 | 107.56% | 1517 | 102.36% |
| I8 | I64 | 2^16 = 65536 | 10.801 | 10.853 | 100.48% | 12.837 | 118.85% | 12.802 | 118.53% |
| I8 | I64 | 2^20 = 1048576 | 20.325 | 20.36 | 100.17% | 27.543 | 135.51% | 27.407 | 134.84% |
| I8 | I64 | 2^24 = 16777216 | 175.416 | 175.46 | 100.03% | 230.256 | 131.26% | 228.794 | 130.43% |
| I8 | I64 | 2^28 = 268435456 | 2628 | 2628 | 100.00% | 3416 | 129.98% | 3393 | 129.11% |
| I8 | I128 | 2^16 = 65536 | 15.027 | 15.268 | 101.60% | 15.411 | 102.56% | 15.486 | 103.05% |
| I8 | I128 | 2^20 = 1048576 | 39.536 | 39.466 | 99.82% | 39.675 | 100.35% | 39.816 | 100.71% |
| I8 | I128 | 2^24 = 16777216 | 377.923 | 377.774 | 99.96% | 379.113 | 100.31% | 379.812 | 100.50% |
| I8 | I128 | 2^28 = 268435456 | 5820 | 5822 | 100.03% | 5841 | 100.36% | 5854 | 100.58% |
| I16 | I8 | 2^16 = 65536 | 10.173 | 10.126 | 99.54% | 10.27 | 100.95% | 10.328 | 101.52% |
| I16 | I8 | 2^20 = 1048576 | 15.17 | 15.142 | 99.82% | 15.316 | 100.96% | 15.341 | 101.13% |
| I16 | I8 | 2^24 = 16777216 | 76.868 | 76.437 | 99.44% | 81.38 | 105.87% | 81.35 | 105.83% |
| I16 | I8 | 2^28 = 268435456 | 1034 | 1025 | 99.13% | 1115 | 107.83% | 1116 | 107.93% |
| I16 | I16 | 2^16 = 65536 | 10.806 | 10.864 | 100.54% | 10.857 | 100.47% | 10.856 | 100.46% |
| I16 | I16 | 2^20 = 1048576 | 15.546 | 15.759 | 101.37% | 15.638 | 100.59% | 15.722 | 101.13% |
| I16 | I16 | 2^24 = 16777216 | 86.469 | 89.01 | 102.94% | 89.666 | 103.70% | 89.738 | 103.78% |
| I16 | I16 | 2^28 = 268435456 | 1135 | 1162 | 102.38% | 1183 | 104.23% | 1183 | 104.23% |
| I16 | I32 | 2^16 = 65536 | 10.925 | 11.328 | 103.69% | 11.424 | 104.57% | 11.362 | 104.00% |
| I16 | I32 | 2^20 = 1048576 | 16.917 | 16.889 | 99.83% | 17.317 | 102.36% | 17.363 | 102.64% |
| I16 | I32 | 2^24 = 16777216 | 107.751 | 107.719 | 99.97% | 113.994 | 105.79% | 113.756 | 105.57% |
| I16 | I32 | 2^28 = 268435456 | 1563 | 1564 | 100.06% | 1639 | 104.86% | 1639 | 104.86% |
| I16 | I64 | 2^16 = 65536 | 10.978 | 11.051 | 100.66% | 12.78 | 116.41% | 12.857 | 117.12% |
| I16 | I64 | 2^20 = 1048576 | 20.974 | 20.938 | 99.83% | 28.242 | 134.65% | 28.125 | 134.09% |
| I16 | I64 | 2^24 = 16777216 | 183.087 | 182.974 | 99.94% | 233.706 | 127.65% | 232.591 | 127.04% |
| I16 | I64 | 2^28 = 268435456 | 2743 | 2743 | 100.00% | 3458 | 126.07% | 3438 | 125.34% |
| I16 | I128 | 2^16 = 65536 | 15.678 | 15.674 | 99.97% | 15.272 | 97.41% | 15.507 | 98.91% |
| I16 | I128 | 2^20 = 1048576 | 37.86 | 37.859 | 100.00% | 40.635 | 107.33% | 40.816 | 107.81% |
| I16 | I128 | 2^24 = 16777216 | 359.871 | 360.242 | 100.10% | 381.626 | 106.05% | 382.443 | 106.27% |
| I16 | I128 | 2^28 = 268435456 | 5497 | 5496 | 99.98% | 5877 | 106.91% | 5892 | 107.19% |
| I32 | I8 | 2^16 = 65536 | 9.931 | 10.024 | 100.94% | 10.258 | 103.29% | 10.323 | 103.95% |
| I32 | I8 | 2^20 = 1048576 | 15.61 | 15.709 | 100.63% | 16.015 | 102.59% | 16.033 | 102.71% |
| I32 | I8 | 2^24 = 16777216 | 89.786 | 89.898 | 100.12% | 93.634 | 104.29% | 93.342 | 103.96% |
| I32 | I8 | 2^28 = 268435456 | 1151 | 1152 | 100.09% | 1228 | 106.69% | 1222 | 106.17% |
| I32 | I16 | 2^16 = 65536 | 10.724 | 10.8 | 100.71% | 11.165 | 104.11% | 10.934 | 101.96% |
| I32 | I16 | 2^20 = 1048576 | 16.455 | 16.509 | 100.33% | 16.848 | 102.39% | 16.875 | 102.55% |
| I32 | I16 | 2^24 = 16777216 | 100.294 | 100.195 | 99.90% | 108.027 | 107.71% | 108.048 | 107.73% |
| I32 | I16 | 2^28 = 268435456 | 1401 | 1400 | 99.93% | 1446 | 103.21% | 1446 | 103.21% |
| I32 | I32 | 2^16 = 65536 | 11.267 | 11.242 | 99.78% | 11.423 | 101.38% | 11.448 | 101.61% |
| I32 | I32 | 2^20 = 1048576 | 18.348 | 18.48 | 100.72% | 18.916 | 103.10% | 18.586 | 101.30% |
| I32 | I32 | 2^24 = 16777216 | 129.095 | 129.248 | 100.12% | 136.07 | 105.40% | 135.907 | 105.28% |
| I32 | I32 | 2^28 = 268435456 | 1903 | 1904 | 100.05% | 2044 | 107.41% | 2043 | 107.36% |
| I32 | I64 | 2^16 = 65536 | 11.749 | 11.85 | 100.86% | 13.584 | 115.62% | 13.575 | 115.54% |
| I32 | I64 | 2^20 = 1048576 | 22.621 | 22.389 | 98.97% | 27.752 | 122.68% | 27.733 | 122.60% |
| I32 | I64 | 2^24 = 16777216 | 197.436 | 197.411 | 99.99% | 230.238 | 116.61% | 230.167 | 116.58% |
| I32 | I64 | 2^28 = 268435456 | 2937 | 2937 | 100.00% | 3478 | 118.42% | 3477 | 118.39% |
| I32 | I128 | 2^16 = 65536 | 16.042 | 16.083 | 100.26% | 15.575 | 97.09% | 15.558 | 96.98% |
| I32 | I128 | 2^20 = 1048576 | 38.876 | 39.092 | 100.56% | 41.437 | 106.59% | 41.631 | 107.09% |
| I32 | I128 | 2^24 = 16777216 | 372.169 | 372.433 | 100.07% | 395.638 | 106.31% | 397.024 | 106.68% |
| I32 | I128 | 2^28 = 268435456 | 5685 | 5687 | 100.04% | 6082 | 106.98% | 6101 | 107.32% |
| I64 | I8 | 2^16 = 65536 | 10.476 | 10.437 | 99.63% | 10.896 | 104.01% | 10.877 | 103.83% |
| I64 | I8 | 2^20 = 1048576 | 20.089 | 20.117 | 100.14% | 20.423 | 101.66% | 20.329 | 101.19% |
| I64 | I8 | 2^24 = 16777216 | 128.927 | 129.206 | 100.22% | 130.882 | 101.52% | 131.046 | 101.64% |
| I64 | I8 | 2^28 = 268435456 | 1731 | 1732 | 100.06% | 1747 | 100.92% | 1749 | 101.04% |
| I64 | I16 | 2^16 = 65536 | 11.073 | 11.138 | 100.59% | 11.432 | 103.24% | 11.365 | 102.64% |
| I64 | I16 | 2^20 = 1048576 | 21.17 | 21.208 | 100.18% | 21.51 | 101.61% | 22.357 | 105.61% |
| I64 | I16 | 2^24 = 16777216 | 143.401 | 143.974 | 100.40% | 144.905 | 101.05% | 161.939 | 112.93% |
| I64 | I16 | 2^28 = 268435456 | 2128 | 2132 | 100.19% | 2135 | 100.33% | 2299 | 108.04% |
| I64 | I32 | 2^16 = 65536 | 11.304 | 11.222 | 99.27% | 11.419 | 101.02% | 11.289 | 99.87% |
| I64 | I32 | 2^20 = 1048576 | 22.771 | 22.896 | 100.55% | 22.949 | 100.78% | 22.949 | 100.78% |
| I64 | I32 | 2^24 = 16777216 | 170.41 | 170.411 | 100.00% | 172.953 | 101.49% | 172.9 | 101.46% |
| I64 | I32 | 2^28 = 268435456 | 2528 | 2528 | 100.00% | 2586 | 102.29% | 2586 | 102.29% |
| I64 | I64 | 2^16 = 65536 | 12.071 | 11.682 | 96.78% | 13.468 | 111.57% | 13.545 | 112.21% |
| I64 | I64 | 2^20 = 1048576 | 26.026 | 25.483 | 97.91% | 32.698 | 125.64% | 32.799 | 126.02% |
| I64 | I64 | 2^24 = 16777216 | 241.63 | 237.496 | 98.29% | 274.646 | 113.66% | 274.758 | 113.71% |
| I64 | I64 | 2^28 = 268435456 | 3644 | 3571 | 98.00% | 4188 | 114.93% | 4188 | 114.93% |
| I64 | I128 | 2^16 = 65536 | 15.811 | 15.962 | 100.96% | 16.165 | 102.24% | 16.484 | 104.26% |
| I64 | I128 | 2^20 = 1048576 | 43.025 | 43.275 | 100.58% | 44.472 | 103.36% | 44.591 | 103.64% |
| I64 | I128 | 2^24 = 16777216 | 421.234 | 422.094 | 100.20% | 427.858 | 101.57% | 427.764 | 101.55% |
| I64 | I128 | 2^28 = 268435456 | 6493 | 6496 | 100.05% | 6592 | 101.52% | 6590 | 101.49% |
| I128 | I8 | 2^16 = 65536 | 11.924 | 12.062 | 101.16% | 12.055 | 101.10% | 12.03 | 100.89% |
| I128 | I8 | 2^20 = 1048576 | 28.097 | 28.258 | 100.57% | 28.88 | 102.79% | 29.01 | 103.25% |
| I128 | I8 | 2^24 = 16777216 | 204.094 | 204.037 | 99.97% | 223.889 | 109.70% | 224.014 | 109.76% |
| I128 | I8 | 2^28 = 268435456 | 3013 | 3011 | 99.93% | 3273 | 108.63% | 3273 | 108.63% |
| I128 | I16 | 2^16 = 65536 | 11.774 | 11.791 | 100.14% | 12.181 | 103.46% | 12.259 | 104.12% |
| I128 | I16 | 2^20 = 1048576 | 27.531 | 27.644 | 100.41% | 30.256 | 109.90% | 30.261 | 109.92% |
| I128 | I16 | 2^24 = 16777216 | 220.072 | 220.154 | 100.04% | 286.454 | 130.16% | 286.554 | 130.21% |
| I128 | I16 | 2^28 = 268435456 | 3264 | 3263 | 99.97% | 4292 | 131.50% | 4294 | 131.56% |
| I128 | I32 | 2^16 = 65536 | 11.991 | 12.005 | 100.12% | 12.303 | 102.60% | 11.974 | 99.86% |
| I128 | I32 | 2^20 = 1048576 | 29.531 | 29.567 | 100.12% | 29.953 | 101.43% | 29.941 | 101.39% |
| I128 | I32 | 2^24 = 16777216 | 256.994 | 257.008 | 100.01% | 257.249 | 100.10% | 257.51 | 100.20% |
| I128 | I32 | 2^28 = 268435456 | 3874 | 3875 | 100.03% | 3877 | 100.08% | 3882 | 100.21% |
| I128 | I64 | 2^16 = 65536 | 12.453 | 12.52 | 100.54% | 14.781 | 118.69% | 14.398 | 115.62% |
| I128 | I64 | 2^20 = 1048576 | 34.415 | 34.418 | 100.01% | 41.487 | 120.55% | 41.383 | 120.25% |
| I128 | I64 | 2^24 = 16777216 | 321.329 | 321.324 | 100.00% | 364.538 | 113.45% | 364.233 | 113.35% |
| I128 | I64 | 2^28 = 268435456 | 4919 | 4920 | 100.02% | 5632 | 114.49% | 5626 | 114.37% |
| I128 | I128 | 2^16 = 65536 | 16.962 | 16.815 | 99.13% | 17.058 | 100.57% | 17.051 | 100.52% |
| I128 | I128 | 2^20 = 1048576 | 49.078 | 48.766 | 99.36% | 49.328 | 100.51% | 49.362 | 100.58% |
| I128 | I128 | 2^24 = 16777216 | 494.191 | 493.485 | 99.86% | 495.389 | 100.24% | 494.93 | 100.15% |
| I128 | I128 | 2^28 = 268435456 | 7616 | 7609 | 99.91% | 7631 | 100.20% | 7628 | 100.16% |
| F32 | I8 | 2^16 = 65536 | 10.01 | 10.018 | 100.08% | 10.311 | 103.01% | 10.226 | 102.16% |
| F32 | I8 | 2^20 = 1048576 | 15.637 | 15.766 | 100.82% | 15.987 | 102.24% | 16.075 | 102.80% |
| F32 | I8 | 2^24 = 16777216 | 90.437 | 90.606 | 100.19% | 94.074 | 104.02% | 93.835 | 103.76% |
| F32 | I8 | 2^28 = 268435456 | 1160 | 1162 | 100.17% | 1235 | 106.47% | 1229 | 105.95% |
| F32 | I16 | 2^16 = 65536 | 10.873 | 10.908 | 100.32% | 11.113 | 102.21% | 11.053 | 101.66% |
| F32 | I16 | 2^20 = 1048576 | 16.481 | 16.829 | 102.11% | 17.268 | 104.78% | 17.231 | 104.55% |
| F32 | I16 | 2^24 = 16777216 | 100.243 | 100.202 | 99.96% | 118.08 | 117.79% | 118.123 | 117.84% |
| F32 | I16 | 2^28 = 268435456 | 1402 | 1400 | 99.86% | 1561 | 111.34% | 1562 | 111.41% |
| F32 | I32 | 2^16 = 65536 | 11.224 | 11.193 | 99.72% | 11.439 | 101.92% | 11.449 | 102.00% |
| F32 | I32 | 2^20 = 1048576 | 18.456 | 18.508 | 100.28% | 18.907 | 102.44% | 18.679 | 101.21% |
| F32 | I32 | 2^24 = 16777216 | 129.388 | 129.414 | 100.02% | 136.287 | 105.33% | 136.065 | 105.16% |
| F32 | I32 | 2^28 = 268435456 | 1906 | 1906 | 100.00% | 2046 | 107.35% | 2045 | 107.29% |
| F32 | I64 | 2^16 = 65536 | 11.476 | 11.746 | 102.35% | 13.369 | 116.50% | 13.303 | 115.92% |
| F32 | I64 | 2^20 = 1048576 | 22.167 | 22.174 | 100.03% | 27.492 | 124.02% | 27.588 | 124.46% |
| F32 | I64 | 2^24 = 16777216 | 197.273 | 197.355 | 100.04% | 230.205 | 116.69% | 230.098 | 116.64% |
| F32 | I64 | 2^28 = 268435456 | 2938 | 2939 | 100.03% | 3481 | 118.48% | 3479 | 118.41% |
| F32 | I128 | 2^16 = 65536 | 15.786 | 15.551 | 98.51% | 15.335 | 97.14% | 15.209 | 96.34% |
| F32 | I128 | 2^20 = 1048576 | 38.731 | 38.839 | 100.28% | 41.108 | 106.14% | 41.236 | 106.47% |
| F32 | I128 | 2^24 = 16777216 | 372.851 | 372.969 | 100.03% | 396.325 | 106.30% | 396.85 | 106.44% |
| F32 | I128 | 2^28 = 268435456 | 5694 | 5692 | 99.96% | 6089 | 106.94% | 6101 | 107.15% |
| F64 | I8 | 2^16 = 65536 | 10.333 | 10.44 | 101.04% | 10.601 | 102.59% | 10.578 | 102.37% |
| F64 | I8 | 2^20 = 1048576 | 19.716 | 20.031 | 101.60% | 20.831 | 105.66% | 20.091 | 101.90% |
| F64 | I8 | 2^24 = 16777216 | 127.413 | 127.423 | 100.01% | 131.051 | 102.86% | 130.267 | 102.24% |
| F64 | I8 | 2^28 = 268435456 | 1721 | 1721 | 100.00% | 1734 | 100.76% | 1731 | 100.58% |
| F64 | I16 | 2^16 = 65536 | 10.895 | 10.911 | 100.15% | 11.085 | 101.74% | 11.121 | 102.07% |
| F64 | I16 | 2^20 = 1048576 | 19.946 | 20.095 | 100.75% | 22.012 | 110.36% | 21.364 | 107.11% |
| F64 | I16 | 2^24 = 16777216 | 137.948 | 137.941 | 99.99% | 160.902 | 116.64% | 144.365 | 104.65% |
| F64 | I16 | 2^28 = 268435456 | 2035 | 2035 | 100.00% | 2295 | 112.78% | 2133 | 104.82% |
| F64 | I32 | 2^16 = 65536 | 10.794 | 10.83 | 100.33% | 11.078 | 102.63% | 11.019 | 102.08% |
| F64 | I32 | 2^20 = 1048576 | 22.374 | 22.355 | 99.92% | 22.493 | 100.53% | 22.509 | 100.60% |
| F64 | I32 | 2^24 = 16777216 | 170.073 | 170.329 | 100.15% | 172.606 | 101.49% | 172.667 | 101.53% |
| F64 | I32 | 2^28 = 268435456 | 2526 | 2525 | 99.96% | 2583 | 102.26% | 2584 | 102.30% |
| F64 | I64 | 2^16 = 65536 | 11.606 | 11.342 | 97.73% | 13.109 | 112.95% | 13.707 | 118.10% |
| F64 | I64 | 2^20 = 1048576 | 25.669 | 27.298 | 106.35% | 32.124 | 125.15% | 33.545 | 130.68% |
| F64 | I64 | 2^24 = 16777216 | 241.189 | 244.218 | 101.26% | 272.85 | 113.13% | 279.738 | 115.98% |
| F64 | I64 | 2^28 = 268435456 | 3641 | 3715 | 102.03% | 4177 | 114.72% | 4199 | 115.33% |
| F64 | I128 | 2^16 = 65536 | 15.502 | 15.755 | 101.63% | 16.087 | 103.77% | 16.008 | 103.26% |
| F64 | I128 | 2^20 = 1048576 | 42.598 | 43.033 | 101.02% | 44.084 | 103.49% | 44.292 | 103.98% |
| F64 | I128 | 2^24 = 16777216 | 420.54 | 420.988 | 100.11% | 426.709 | 101.47% | 426.821 | 101.49% |
| F64 | I128 | 2^28 = 268435456 | 6481 | 6486 | 100.08% | 6578 | 101.50% | 6576 | 101.47% |
| C64 | I8 | 2^16 = 65536 | 10.632 | 10.696 | 100.60% | 10.63 | 99.98% | 10.733 | 100.95% |
| C64 | I8 | 2^20 = 1048576 | 20.116 | 20.194 | 100.39% | 21.653 | 107.64% | 20.631 | 102.56% |
| C64 | I8 | 2^24 = 16777216 | 129.099 | 128.937 | 99.87% | 131.051 | 101.51% | 130.131 | 100.80% |
| C64 | I8 | 2^28 = 268435456 | 1731 | 1731 | 100.00% | 1730 | 99.94% | 1731 | 100.00% |
| C64 | I16 | 2^16 = 65536 | 11.144 | 11.178 | 100.31% | 11.437 | 102.63% | 11.368 | 102.01% |
| C64 | I16 | 2^20 = 1048576 | 20.543 | 20.479 | 99.69% | 21.509 | 104.70% | 22.413 | 109.10% |
| C64 | I16 | 2^24 = 16777216 | 138.223 | 138.321 | 100.07% | 144.949 | 104.87% | 162.12 | 117.29% |
| C64 | I16 | 2^28 = 268435456 | 2038 | 2038 | 100.00% | 2137 | 104.86% | 2301 | 112.90% |
| C64 | I32 | 2^16 = 65536 | 11.346 | 11.251 | 99.16% | 11.397 | 100.45% | 11.383 | 100.33% |
| C64 | I32 | 2^20 = 1048576 | 22.562 | 22.727 | 100.73% | 22.861 | 101.33% | 22.787 | 101.00% |
| C64 | I32 | 2^24 = 16777216 | 170.436 | 170.499 | 100.04% | 172.914 | 101.45% | 172.916 | 101.46% |
| C64 | I32 | 2^28 = 268435456 | 2528 | 2528 | 100.00% | 2586 | 102.29% | 2586 | 102.29% |
| C64 | I64 | 2^16 = 65536 | 11.656 | 11.707 | 100.44% | 13.545 | 116.21% | 13.565 | 116.38% |
| C64 | I64 | 2^20 = 1048576 | 25.454 | 25.372 | 99.68% | 32.804 | 128.88% | 33.026 | 129.75% |
| C64 | I64 | 2^24 = 16777216 | 237.429 | 237.381 | 99.98% | 276.199 | 116.33% | 276.845 | 116.60% |
| C64 | I64 | 2^28 = 268435456 | 3570 | 3571 | 100.03% | 4194 | 117.48% | 4200 | 117.65% |
| C64 | I128 | 2^16 = 65536 | 15.752 | 15.865 | 100.72% | 16.34 | 103.73% | 15.994 | 101.54% |
| C64 | I128 | 2^20 = 1048576 | 43.371 | 43.177 | 99.55% | 44.374 | 102.31% | 44.2 | 101.91% |
| C64 | I128 | 2^24 = 16777216 | 421.998 | 421.845 | 99.96% | 429.056 | 101.67% | 427.997 | 101.42% |
| C64 | I128 | 2^28 = 268435456 | | | | | | | |
device_select_flagged
Results
| T{ct} | IsInPlace{ct} | Elements{io} | Entropy | i32 | u32 | u32/i32 time | i64 | i64/i32 time | u64 | u64/i32 time |
|---|---|---|---|---|---|---|---|---|---|---|
| I8 | FALSE | 2^16 = 65536 | 1 | 9.995 | 10.418 | 104.23% | 9.296 | 93.01% | 9.905 | 99.10% |
| I8 | FALSE | 2^20 = 1048576 | 1 | 12.702 | 12.88 | 101.40% | 13.763 | 108.35% | 14.044 | 110.57% |
| I8 | FALSE | 2^24 = 16777216 | 1 | 60.62 | 60.536 | 99.86% | 77.011 | 127.04% | 77.395 | 127.67% |
| I8 | FALSE | 2^28 = 268435456 | 1 | 792.492 | 788.481 | 99.49% | 1096 | 138.30% | 1096 | 138.30% |
| I8 | FALSE | 2^16 = 65536 | 0.544 | 9.879 | 10.329 | 104.56% | 9.201 | 93.14% | 9.762 | 98.82% |
| I8 | FALSE | 2^20 = 1048576 | 0.544 | 12.48 | 12.797 | 102.54% | 13.524 | 108.37% | 13.894 | 111.33% |
| I8 | FALSE | 2^24 = 16777216 | 0.544 | 58.75 | 59.06 | 100.53% | 75.243 | 128.07% | 75.436 | 128.40% |
| I8 | FALSE | 2^28 = 268435456 | 0.544 | 764.061 | 763.908 | 99.98% | 1063 | 139.13% | 1063 | 139.13% |
| I8 | FALSE | 2^16 = 65536 | 0 | 9.628 | 10.18 | 105.73% | 9.088 | 94.39% | 9.159 | 95.13% |
| I8 | FALSE | 2^20 = 1048576 | 0 | 12.057 | 12.345 | 102.39% | 13.783 | 114.32% | 13.394 | 111.09% |
| I8 | FALSE | 2^24 = 16777216 | 0 | 52.13 | 51.971 | 99.69% | 69.871 | 134.03% | 69.613 | 133.54% |
| I8 | FALSE | 2^28 = 268435456 | 0 | 629.964 | 626.662 | 99.48% | 950.144 | 150.83% | 949.408 | 150.71% |
| I8 | TRUE | 2^16 = 65536 | 1 | 10.953 | 11.104 | 101.38% | 10.385 | 94.81% | 10.351 | 94.50% |
| I8 | TRUE | 2^20 = 1048576 | 1 | 13.73 | 13.715 | 99.89% | 16.256 | 118.40% | 15.897 | 115.78% |
| I8 | TRUE | 2^24 = 16777216 | 1 | 67.152 | 67.074 | 99.88% | 103.498 | 154.12% | 103.254 | 153.76% |
| I8 | TRUE | 2^28 = 268435456 | 1 | 916.669 | 910.642 | 99.34% | 1531 | 167.02% | 1531 | 167.02% |
| I8 | TRUE | 2^16 = 65536 | 0.544 | 11.06 | 10.677 | 96.54% | 10.391 | 93.95% | 10.142 | 91.70% |
| I8 | TRUE | 2^20 = 1048576 | 0.544 | 13.832 | 13.625 | 98.50% | 16.04 | 115.96% | 15.671 | 113.30% |
| I8 | TRUE | 2^24 = 16777216 | 0.544 | 65.984 | 65.799 | 99.72% | 101.356 | 153.61% | 101.077 | 153.18% |
| I8 | TRUE | 2^28 = 268435456 | 0.544 | 887.126 | 886.643 | 99.95% | 1494 | 168.41% | 1494 | 168.41% |
| I8 | TRUE | 2^16 = 65536 | 0 | 10.915 | 10.488 | 96.09% | 10.091 | 92.45% | 9.941 | 91.08% |
| I8 | TRUE | 2^20 = 1048576 | 0 | 13.441 | 13.252 | 98.59% | 16.05 | 119.41% | 15.69 | 116.73% |
| I8 | TRUE | 2^24 = 16777216 | 0 | 59.633 | 59.23 | 99.32% | 95.516 | 160.17% | 95.341 | 159.88% |
| I8 | TRUE | 2^28 = 268435456 | 0 | 757.158 | 754.197 | 99.61% | 1380 | 182.26% | 1380 | 182.26% |
| I16 | FALSE | 2^16 = 65536 | 1 | 10.589 | 10.877 | 102.72% | 9.598 | 90.64% | 9.529 | 89.99% |
| I16 | FALSE | 2^20 = 1048576 | 1 | 13.842 | 13.736 | 99.23% | 14.541 | 105.05% | 14.236 | 102.85% |
| I16 | FALSE | 2^24 = 16777216 | 1 | 77.725 | 77.586 | 99.82% | 83.402 | 107.30% | 83.178 | 107.02% |
| I16 | FALSE | 2^28 = 268435456 | 1 | 989.541 | 989.898 | 100.04% | 1166 | 117.83% | 1166 | 117.83% |
| I16 | FALSE | 2^16 = 65536 | 0.544 | 10.613 | 10.351 | 97.53% | 9.685 | 91.26% | 9.475 | 89.28% |
| I16 | FALSE | 2^20 = 1048576 | 0.544 | 14.054 | 13.832 | 98.42% | 14.463 | 102.91% | 14.13 | 100.54% |
| I16 | FALSE | 2^24 = 16777216 | 0.544 | 72.557 | 72.084 | 99.35% | 80.977 | 111.60% | 80.598 | 111.08% |
| I16 | FALSE | 2^28 = 268435456 | 0.544 | 907.356 | 905.404 | 99.78% | 1133 | 124.87% | 1132 | 124.76% |
| I16 | FALSE | 2^16 = 65536 | 0 | 10.129 | 9.929 | 98.03% | 9.414 | 92.94% | 9.196 | 90.79% |
| I16 | FALSE | 2^20 = 1048576 | 0 | 13.7 | 13.235 | 96.61% | 14.139 | 103.20% | 13.804 | 100.76% |
| I16 | FALSE | 2^24 = 16777216 | 0 | 64.863 | 64.404 | 99.29% | 74.501 | 114.86% | 74.266 | 114.50% |
| I16 | FALSE | 2^28 = 268435456 | 0 | 683.506 | 681.977 | 99.78% | 984.721 | 144.07% | 984.309 | 144.01% |
| I16 | TRUE | 2^16 = 65536 | 1 | 11.395 | 11.02 | 96.71% | 10.361 | 90.93% | 10.274 | 90.16% |
| I16 | TRUE | 2^20 = 1048576 | 1 | 15.223 | 15.025 | 98.70% | 16.78 | 110.23% | 16.405 | 107.76% |
| I16 | TRUE | 2^24 = 16777216 | 1 | 82.89 | 82.844 | 99.94% | 110.122 | 132.85% | 109.795 | 132.46% |
| I16 | TRUE | 2^28 = 268435456 | 1 | 1082 | 1084 | 100.18% | 1620 | 149.72% | 1619 | 149.63% |
| I16 | TRUE | 2^16 = 65536 | 0.544 | 11.329 | 11.059 | 97.62% | 10.515 | 92.81% | 10.201 | 90.04% |
| I16 | TRUE | 2^20 = 1048576 | 0.544 | 15.118 | 14.849 | 98.22% | 16.625 | 109.97% | 16.639 | 110.06% |
| I16 | TRUE | 2^24 = 16777216 | 0.544 | 78.322 | 77.925 | 99.49% | 107.371 | 137.09% | 107.354 | 137.07% |
| I16 | TRUE | 2^28 = 268435456 | 0.544 | 1021 | 1021 | 100.00% | 1580 | 154.75% | 1580 | 154.75% |
| I16 | TRUE | 2^16 = 65536 | 0 | 11.023 | 10.558 | 95.78% | 10.025 | 90.95% | 10.188 | 92.42% |
| I16 | TRUE | 2^20 = 1048576 | 0 | 14.565 | 14.311 | 98.26% | 16.133 | 110.77% | 16.424 | 112.76% |
| I16 | TRUE | 2^24 = 16777216 | 0 | 69.429 | 69.019 | 99.41% | 100.17 | 144.28% | 100.298 | 144.46% |
| I16 | TRUE | 2^28 = 268435456 | 0 | 820.309 | 819.437 | 99.89% | 1420 | 173.11% | 1420 | 173.11% |
| I32 | FALSE | 2^16 = 65536 | 1 | 10.112 | 10.135 | 100.23% | 9.77 | 96.62% | 9.695 | 95.88% |
| I32 | FALSE | 2^20 = 1048576 | 1 | 15.672 | 15.314 | 97.72% | 16.033 | 102.30% | 15.757 | 100.54% |
| I32 | FALSE | 2^24 = 16777216 | 1 | 104.585 | 104.221 | 99.65% | 109.972 | 105.15% | 109.806 | 104.99% |
| I32 | FALSE | 2^28 = 268435456 | 1 | 1516 | 1516 | 100.00% | 1625 | 107.19% | 1625 | 107.19% |
| I32 | FALSE | 2^16 = 65536 | 0.544 | 10.246 | 10.015 | 97.75% | 9.732 | 94.98% | 9.509 | 92.81% |
| I32 | FALSE | 2^20 = 1048576 | 0.544 | 15.771 | 15.206 | 96.42% | 15.964 | 101.22% | 15.502 | 98.29% |
| I32 | FALSE | 2^24 = 16777216 | 0.544 | 93.649 | 93.243 | 99.57% | 98.324 | 104.99% | 97.967 | 104.61% |
| I32 | FALSE | 2^28 = 268435456 | 0.544 | 1267 | 1267 | 100.00% | 1358 | 107.18% | 1358 | 107.18% |
| I32 | FALSE | 2^16 = 65536 | 0 | 9.66 | 9.541 | 98.77% | 9.822 | 101.68% | 9.787 | 101.31% |
| I32 | FALSE | 2^20 = 1048576 | 0 | 15.187 | 14.599 | 96.13% | 15.234 | 100.31% | 15.613 | 102.81% |
| I32 | FALSE | 2^24 = 16777216 | 0 | 78.986 | 78.558 | 99.46% | 84.244 | 106.66% | 84.49 | 106.97% |
| I32 | FALSE | 2^28 = 268435456 | 0 | 849.659 | 849.495 | 99.98% | 1054 | 124.05% | 1055 | 124.17% |
| I32 | TRUE | 2^16 = 65536 | 1 | 10.962 | 11.097 | 101.23% | 10.425 | 95.10% | 10.702 | 97.63% |
| I32 | TRUE | 2^20 = 1048576 | 1 | 17.145 | 17.019 | 99.27% | 17.912 | 104.47% | 18.234 | 106.35% |
| I32 | TRUE | 2^24 = 16777216 | 1 | 110.155 | 110.144 | 99.99% | 130.047 | 118.06% | 130.211 | 118.21% |
| I32 | TRUE | 2^28 = 268435456 | 1 | 1589 | 1589 | 100.00% | 1906 | 119.95% | 1907 | 120.01% |
| I32 | TRUE | 2^16 = 65536 | 0.544 | 10.684 | 11.006 | 103.01% | 10.296 | 96.37% | 10.333 | 96.71% |
| I32 | TRUE | 2^20 = 1048576 | 0.544 | 16.67 | 16.897 | 101.36% | 17.654 | 105.90% | 18.026 | 108.13% |
| I32 | TRUE | 2^24 = 16777216 | 0.544 | 99.444 | 99.745 | 100.30% | 121.45 | 122.13% | 121.81 | 122.49% |
| I32 | TRUE | 2^28 = 268435456 | 0.544 | 1330 | 1330 | 100.00% | 1757 | 132.11% | 1758 | 132.18% |
| I32 | TRUE | 2^16 = 65536 | 0 | 10.111 | 10.421 | 103.07% | 10.255 | 101.42% | 10.258 | 101.45% |
| I32 | TRUE | 2^20 = 1048576 | 0 | 15.683 | 16.06 | 102.40% | 17.414 | 111.04% | 17.773 | 113.33% |
| I32 | TRUE | 2^24 = 16777216 | 0 | 84.553 | 84.825 | 100.32% | 108.812 | 128.69% | 109.119 | 129.05% |
| I32 | TRUE | 2^28 = 268435456 | 0 | 966.425 | 966.334 | 99.99% | 1498 | 155.00% | 1499 | 155.11% |
| I64 | FALSE | 2^16 = 65536 | 1 | 10.445 | 10.184 | 97.50% | 10.246 | 98.09% | 10 | 95.74% |
| I64 | FALSE | 2^20 = 1048576 | 1 | 20.121 | 19.964 | 99.22% | 20.442 | 101.60% | 20.044 | 99.62% |
| I64 | FALSE | 2^24 = 16777216 | 1 | 180.796 | 181.002 | 100.11% | 191.234 | 105.77% | 191.101 | 105.70% |
| I64 | FALSE | 2^28 = 268435456 | 1 | 2744 | 2747 | 100.11% | 2954 | 107.65% | 2954 | 107.65% |
| I64 | FALSE | 2^16 = 65536 | 0.544 | 10.174 | 10.651 | 104.69% | 10.187 | 100.13% | 9.777 | 96.10% |
| I64 | FALSE | 2^20 = 1048576 | 0.544 | 19.672 | 19.795 | 100.63% | 20.188 | 102.62% | 19.834 | 100.82% |
| I64 | FALSE | 2^24 = 16777216 | 0.544 | 152.102 | 152.387 | 100.19% | 161.845 | 106.41% | 161.532 | 106.20% |
| I64 | FALSE | 2^28 = 268435456 | 0.544 | 2246 | 2246 | 100.00% | 2419 | 107.70% | 2419 | 107.70% |
| I64 | FALSE | 2^16 = 65536 | 0 | 9.368 | 9.909 | 105.77% | 10.048 | 107.26% | 9.707 | 103.62% |
| I64 | FALSE | 2^20 = 1048576 | 0 | 18.443 | 18.468 | 100.14% | 19.652 | 106.56% | 19.261 | 104.44% |
| I64 | FALSE | 2^24 = 16777216 | 0 | 112.768 | 112.864 | 100.09% | 126.92 | 112.55% | 126.679 | 112.34% |
| I64 | FALSE | 2^28 = 268435456 | 0 | 1384 | 1384 | 100.00% | 1690 | 122.11% | 1690 | 122.11% |
| I64 | TRUE | 2^16 = 65536 | 1 | 11.035 | 11.239 | 101.85% | 11.371 | 103.04% | 10.848 | 98.31% |
| I64 | TRUE | 2^20 = 1048576 | 1 | 21.012 | 21.313 | 101.43% | 23.429 | 111.50% | 23.181 | 110.32% |
| I64 | TRUE | 2^24 = 16777216 | 1 | 186.174 | 186.436 | 100.14% | 219.016 | 117.64% | 218.728 | 117.49% |
| I64 | TRUE | 2^28 = 268435456 | 1 | 2841 | 2841 | 100.00% | 3324 | 117.00% | 3326 | 117.07% |
| I64 | TRUE | 2^16 = 65536 | 0.544 | 11.062 | 11.354 | 102.64% | 11.17 | 100.98% | 10.738 | 97.07% |
| I64 | TRUE | 2^20 = 1048576 | 0.544 | 20.767 | 21.096 | 101.58% | 23.228 | 111.85% | 23.295 | 112.17% |
| I64 | TRUE | 2^24 = 16777216 | 0.544 | 158.215 | 158.444 | 100.14% | 195.572 | 123.61% | 195.6 | 123.63% |
| I64 | TRUE | 2^28 = 268435456 | 0.544 | 2355 | 2354 | 99.96% | 2908 | 123.48% | 2908 | 123.48% |
| I64 | TRUE | 2^16 = 65536 | 0 | 10.386 | 10.692 | 102.95% | 10.847 | 104.44% | 11.128 | 107.14% |
| I64 | TRUE | 2^20 = 1048576 | 0 | 19.307 | 19.884 | 102.99% | 22.178 | 114.87% | 22.674 | 117.44% |
| I64 | TRUE | 2^24 = 16777216 | 0 | 120.013 | 120.452 | 100.37% | 168.491 | 140.39% | 168.888 | 140.72% |
| I64 | TRUE | 2^28 = 268435456 | 0 | 1502 | 1502 | 100.00% | 2428 | 161.65% | 2428 | 161.65% |
| I128 | FALSE | 2^16 = 65536 | 1 | 10.824 | 10.612 | 98.04% | 10.96 | 101.26% | 11.33 | 104.67% |
| I128 | FALSE | 2^20 = 1048576 | 1 | 30.471 | 30.217 | 99.17% | 33.423 | 109.69% | 33.555 | 110.12% |
| I128 | FALSE | 2^24 = 16777216 | 1 | 346.419 | 345.94 | 99.86% | 381.333 | 110.08% | 380.909 | 109.96% |
| I128 | FALSE | 2^28 = 268435456 | 1 | 5405 | 5406 | 100.02% | 6023 | 111.43% | 6007 | 111.14% |
| I128 | FALSE | 2^16 = 65536 | 0.544 | 10.782 | 10.672 | 98.98% | 10.965 | 101.70% | 11.478 | 106.46% |
| I128 | FALSE | 2^20 = 1048576 | 0.544 | 27.85 | 27.636 | 99.23% | 30.877 | 110.87% | 31.098 | 111.66% |
| I128 | FALSE | 2^24 = 16777216 | 0.544 | 278.74 | 279.077 | 100.12% | 315.154 | 113.06% | 314.729 | 112.91% |
| I128 | FALSE | 2^28 = 268435456 | 0.544 | 4275 | 4280 | 100.12% | 4862 | 113.73% | 4862 | 113.73% |
| I128 | FALSE | 2^16 = 65536 | 0 | 10.473 | 10.198 | 97.37% | 10.741 | 102.56% | 10.989 | 104.93% |
| I128 | FALSE | 2^20 = 1048576 | 0 | 25.592 | 25.364 | 99.11% | 28.565 | 111.62% | 28.4 | 110.97% |
| I128 | FALSE | 2^24 = 16777216 | 0 | 184.311 | 185.224 | 100.50% | 253.648 | 137.62% | 253.715 | 137.66% |
| I128 | FALSE | 2^28 = 268435456 | 0 | 2589 | 2592 | 100.12% | 3791 | 146.43% | 3791 | 146.43% |
| I128 | TRUE | 2^16 = 65536 | 1 | 11.714 | 11.472 | 97.93% | 12.611 | 107.66% | 12.351 | 105.44% |
| I128 | TRUE | 2^20 = 1048576 | 1 | 31.781 | 31.491 | 99.09% | 39.235 | 123.45% | 38.88 | 122.34% |
| I128 | TRUE | 2^24 = 16777216 | 1 | 351.823 | 352.084 | 100.07% | 458.808 | 130.41% | 458.459 | 130.31% |
| I128 | TRUE | 2^28 = 268435456 | 1 | 5481 | 5493 | 100.22% | 7163 | 130.69% | 7165 | 130.72% |
| I128 | TRUE | 2^16 = 65536 | 0.544 | 11.422 | 11.209 | 98.14% | 12.591 | 110.23% | 12.335 | 107.99% |
| I128 | TRUE | 2^20 = 1048576 | 0.544 | 29.359 | 29.384 | 100.09% | 37.618 | 128.13% | 37.19 | 126.67% |
| I128 | TRUE | 2^24 = 16777216 | 0.544 | 290.386 | 290.97 | 100.20% | 418.685 | 144.18% | 418.244 | 144.03% |
| I128 | TRUE | 2^28 = 268435456 | 0.544 | 4481 | 4490 | 100.20% | 6522 | 145.55% | 6522 | 145.55% |
| I128 | TRUE | 2^16 = 65536 | 0 | 11.003 | 10.779 | 97.96% | 12.596 | 114.48% | 12.276 | 111.57% |
| I128 | TRUE | 2^20 = 1048576 | 0 | 27.196 | 27.277 | 100.30% | 35.527 | 130.63% | 35.246 | 129.60% |
| I128 | TRUE | 2^24 = 16777216 | 0 | 206.855 | 208.214 | 100.66% | 366.605 | 177.23% | 366.4 | 177.13% |
| I128 | TRUE | 2^28 = 268435456 | 0 | 2949 | 2971 | 100.75% | 5634 | 191.05% | 5640 | 191.25% |
| F32 | FALSE | 2^16 = 65536 | 1 | 10.093 | 10.377 | 102.81% | 9.478 | 93.91% | 9.986 | 98.94% |
| F32 | FALSE | 2^20 = 1048576 | 1 | 15.172 | 15.534 | 102.39% | 15.61 | 102.89% | 16.104 | 106.14% |
| F32 | FALSE | 2^24 = 16777216 | 1 | 104.098 | 104.33 | 100.22% | 109.627 | 105.31% | 109.917 | 105.59% |
| F32 | FALSE | 2^28 = 268435456 | 1 | 1513 | 1513 | 100.00% | 1622 | 107.20% | 1624 | 107.34% |
| F32 | FALSE | 2^16 = 65536 | 0.544 | 10.33 | 10.279 | 99.51% | 9.313 | 90.15% | 9.881 | 95.65% |
| F32 | FALSE | 2^20 = 1048576 | 0.544 | 15.462 | 15.674 | 101.37% | 15.461 | 99.99% | 15.742 | 101.81% |
| F32 | FALSE | 2^24 = 16777216 | 0.544 | 93.507 | 93.616 | 100.12% | 97.712 | 104.50% | 98.192 | 105.01% |
| F32 | FALSE | 2^28 = 268435456 | 0.544 | 1265 | 1265 | 100.00% | 1355 | 107.11% | 1356 | 107.19% |
| F32 | FALSE | 2^16 = 65536 | 0 | 9.931 | 9.62 | 96.87% | 9.194 | 92.58% | 9.653 | 97.20% |
| F32 | FALSE | 2^20 = 1048576 | 0 | 14.843 | 14.999 | 101.05% | 15.156 | 102.11% | 15.63 | 105.30% |
| F32 | FALSE | 2^24 = 16777216 | 0 | 78.724 | 78.722 | 100.00% | 84.214 | 106.97% | 84.52 | 107.36% |
| F32 | FALSE | 2^28 = 268435456 | 0 | 848.093 | 848.322 | 100.03% | 1053 | 124.16% | 1051 | 123.93% |
| F32 | TRUE | 2^16 = 65536 | 1 | 11.144 | 10.866 | 97.51% | 10.397 | 93.30% | 10.753 | 96.49% |
| F32 | TRUE | 2^20 = 1048576 | 1 | 16.704 | 16.86 | 100.93% | 17.822 | 106.69% | 18.232 | 109.15% |
| F32 | TRUE | 2^24 = 16777216 | 1 | 109.849 | 109.861 | 100.01% | 129.661 | 118.04% | 129.98 | 118.33% |
| F32 | TRUE | 2^28 = 268435456 | 1 | 1587 | 1587 | 100.00% | 1901 | 119.79% | 1903 | 119.91% |
| F32 | TRUE | 2^16 = 65536 | 0.544 | 11.007 | 10.878 | 98.83% | 10.224 | 92.89% | 10.345 | 93.99% |
| F32 | TRUE | 2^20 = 1048576 | 0.544 | 16.655 | 16.952 | 101.78% | 17.595 | 105.64% | 17.68 | 106.15% |
| F32 | TRUE | 2^24 = 16777216 | 0.544 | 99.531 | 99.614 | 100.08% | 121.195 | 121.77% | 121.47 | 122.04% |
| F32 | TRUE | 2^28 = 268435456 | 0.544 | 1328 | 1328 | 100.00% | 1747 | 131.55% | 1753 | 132.00% |
| F32 | TRUE | 2^16 = 65536 | 0 | 10.582 | 10.381 | 98.10% | 10.442 | 98.68% | 9.941 | 93.94% |
| F32 | TRUE | 2^20 = 1048576 | 0 | 15.909 | 16.015 | 100.67% | 17.738 | 111.50% | 17.384 | 109.27% |
| F32 | TRUE | 2^24 = 16777216 | 0 | 84.593 | 84.606 | 100.02% | 108.872 | 128.70% | 108.731 | 128.53% |
| F32 | TRUE | 2^28 = 268435456 | 0 | 964.432 | 964.378 | 99.99% | 1487 | 154.18% | 1493 | 154.81% |
| F64 | FALSE | 2^16 = 65536 | 1 | 10.024 | 10.402 | 103.77% | 9.931 | 99.07% | 10.299 | 102.74% |
| F64 | FALSE | 2^20 = 1048576 | 1 | 19.88 | 20.085 | 101.03% | 20.03 | 100.75% | 20.243 | 101.83% |
| F64 | FALSE | 2^24 = 16777216 | 1 | 180.422 | 181.02 | 100.33% | 190.75 | 105.72% | 190.869 | 105.79% |
| F64 | FALSE | 2^28 = 268435456 | 1 | 2744 | 2748 | 100.15% | 2955 | 107.69% | 2950 | 107.51% |
| F64 | FALSE | 2^16 = 65536 | 0.544 | 10.084 | 10.477 | 103.90% | 9.967 | 98.84% | 10.124 | 100.40% |
| F64 | FALSE | 2^20 = 1048576 | 0.544 | 19.411 | 20.162 | 103.87% | 19.829 | 102.15% | 20.06 | 103.34% |
| F64 | FALSE | 2^24 = 16777216 | 0.544 | 151.986 | 152.496 | 100.34% | 161.334 | 106.15% | 161.352 | 106.16% |
| F64 | FALSE | 2^28 = 268435456 | 0.544 | 2246 | 2246 | 100.00% | 2416 | 107.57% | 2413 | 107.44% |
| F64 | FALSE | 2^16 = 65536 | 0 | 9.59 | 9.659 | 100.72% | 9.887 | 103.10% | 9.709 | 101.24% |
| F64 | FALSE | 2^20 = 1048576 | 0 | 18.164 | 18.76 | 103.28% | 19.428 | 106.96% | 19.117 | 105.25% |
| F64 | FALSE | 2^24 = 16777216 | 0 | 112.634 | 113.08 | 100.40% | 126.29 | 112.12% | 126.039 | 111.90% |
| F64 | FALSE | 2^28 = 268435456 | 0 | 1383 | 1383 | 100.00% | 1682 | 121.62% | 1680 | 121.48% |
| F64 | TRUE | 2^16 = 65536 | 1 | 11.058 | 10.96 | 99.11% | 11.132 | 100.67% | 10.849 | 98.11% |
| F64 | TRUE | 2^20 = 1048576 | 1 | 21.084 | 21.173 | 100.42% | 23.453 | 111.24% | 23.031 | 109.23% |
| F64 | TRUE | 2^24 = 16777216 | 1 | 186.077 | 186.193 | 100.06% | 218.024 | 117.17% | 217.733 | 117.01% |
| F64 | TRUE | 2^28 = 268435456 | 1 | 2842 | 2840 | 99.93% | 3313 | 116.57% | 3313 | 116.57% |
| F64 | TRUE | 2^16 = 65536 | 0.544 | 11.339 | 10.842 | 95.62% | 11.038 | 97.35% | 10.729 | 94.62% |
| F64 | TRUE | 2^20 = 1048576 | 0.544 | 21.317 | 20.955 | 98.30% | 23.042 | 108.09% | 22.786 | 106.89% |
| F64 | TRUE | 2^24 = 16777216 | 0.544 | 158.347 | 158.071 | 99.83% | 194.517 | 122.84% | 194.228 | 122.66% |
| F64 | TRUE | 2^28 = 268435456 | 0.544 | 2353 | 2352 | 99.96% | 2891 | 122.86% | 2891 | 122.86% |
| F64 | TRUE | 2^16 = 65536 | 0 | 10.407 | 10.11 | 97.15% | 11.129 | 106.94% | 10.583 | 101.69% |
| F64 | TRUE | 2^20 = 1048576 | 0 | 19.943 | 19.746 | 99.01% | 22.51 | 112.87% | 22.179 | 111.21% |
| F64 | TRUE | 2^24 = 16777216 | 0 | 120.405 | 120.086 | 99.74% | 167.749 | 139.32% | 167.342 | 138.98% |
| F64 | TRUE | 2^28 = 268435456 | 0 | 1500 | 1500 | 100.00% | 2410 | 160.67% | 2410 | 160.67% |
device_select_if
Results
| T{ct} | IsInPlace{ct} | Elements{io} | Entropy | i32 | u32 | u32/i32 time | i64 | i64/i32 time | u64 | u64/i32 time |
|---|---|---|---|---|---|---|---|---|---|---|
| I8 | FALSE | 2^16 = 65536 | 1 | 9.593 | 9.286 | 96.80% | 9.412 | 98.11% | 9.229 | 96.21% |
| I8 | FALSE | 2^20 = 1048576 | 1 | 12.372 | 11.839 | 95.69% | 13.272 | 107.27% | 12.927 | 104.49% |
| I8 | FALSE | 2^24 = 16777216 | 1 | 51.753 | 50.355 | 97.30% | 71.368 | 137.90% | 71.033 | 137.25% |
| I8 | FALSE | 2^28 = 268435456 | 1 | 694.494 | 674.973 | 97.19% | 1038 | 149.46% | 1037 | 149.32% |
| I8 | FALSE | 2^16 = 65536 | 0.544 | 9.579 | 9.489 | 99.06% | 9.539 | 99.58% | 9.426 | 98.40% |
| I8 | FALSE | 2^20 = 1048576 | 0.544 | 12.117 | 11.601 | 95.74% | 12.919 | 106.62% | 12.699 | 104.80% |
| I8 | FALSE | 2^24 = 16777216 | 0.544 | 48.466 | 47.774 | 98.57% | 68.568 | 141.48% | 68.269 | 140.86% |
| I8 | FALSE | 2^28 = 268435456 | 0.544 | 639.299 | 633.134 | 99.04% | 987.549 | 154.47% | 987.764 | 154.51% |
| I8 | FALSE | 2^16 = 65536 | 0 | 9.351 | 9.385 | 100.36% | 9.409 | 100.62% | 9.796 | 104.76% |
| I8 | FALSE | 2^20 = 1048576 | 0 | 11.646 | 11.431 | 98.15% | 12.485 | 107.20% | 12.878 | 110.58% |
| I8 | FALSE | 2^24 = 16777216 | 0 | 42.488 | 42.102 | 99.09% | 63.775 | 150.10% | 64.159 | 151.00% |
| I8 | FALSE | 2^28 = 268435456 | 0 | 526.733 | 524.933 | 99.66% | 888.127 | 168.61% | 888.236 | 168.63% |
| I8 | TRUE | 2^16 = 65536 | 1 | 10.381 | 10.382 | 100.01% | 9.85 | 94.88% | 10.322 | 99.43% |
| I8 | TRUE | 2^20 = 1048576 | 1 | 13.477 | 13.688 | 101.57% | 15.215 | 112.90% | 15.423 | 114.44% |
| I8 | TRUE | 2^24 = 16777216 | 1 | 61.16 | 59.945 | 98.01% | 96.48 | 157.75% | 96.627 | 157.99% |
| I8 | TRUE | 2^28 = 268435456 | 1 | 857.119 | 839.771 | 97.98% | 1472 | 171.74% | 1472 | 171.74% |
| I8 | TRUE | 2^16 = 65536 | 0.544 | 10.29 | 10.334 | 100.43% | 9.981 | 97.00% | 10.312 | 100.21% |
| I8 | TRUE | 2^20 = 1048576 | 0.544 | 12.947 | 13.39 | 103.42% | 15.046 | 116.21% | 15.182 | 117.26% |
| I8 | TRUE | 2^24 = 16777216 | 0.544 | 57.827 | 57.808 | 99.97% | 93.575 | 161.82% | 94.154 | 162.82% |
| I8 | TRUE | 2^28 = 268435456 | 0.544 | 802.315 | 798.158 | 99.48% | 1417 | 176.61% | 1417 | 176.61% |
| I8 | TRUE | 2^16 = 65536 | 0 | 10.051 | 10.256 | 102.04% | 9.943 | 98.93% | 10.503 | 104.50% |
| I8 | TRUE | 2^20 = 1048576 | 0 | 12.533 | 13.033 | 103.99% | 14.786 | 117.98% | 15.076 | 120.29% |
| I8 | TRUE | 2^24 = 16777216 | 0 | 52.096 | 52.376 | 100.54% | 89.2 | 171.22% | 89.499 | 171.80% |
| I8 | TRUE | 2^28 = 268435456 | 0 | 693.302 | 692.789 | 99.93% | 1317 | 189.96% | 1317 | 189.96% |
| I16 | FALSE | 2^16 = 65536 | 1 | 10.11 | 9.802 | 96.95% | 9.41 | 93.08% | 9.639 | 95.34% |
| I16 | FALSE | 2^20 = 1048576 | 1 | 13.546 | 13.262 | 97.90% | 13.727 | 101.34% | 13.61 | 100.47% |
| I16 | FALSE | 2^24 = 16777216 | 1 | 63.446 | 63.064 | 99.40% | 77.355 | 121.92% | 77.197 | 121.67% |
| I16 | FALSE | 2^28 = 268435456 | 1 | 883.8 | 875.282 | 99.04% | 1109 | 125.48% | 1109 | 125.48% |
| I16 | FALSE | 2^16 = 65536 | 0.544 | 9.468 | 10.04 | 106.04% | 9.377 | 99.04% | 9.528 | 100.63% |
| I16 | FALSE | 2^20 = 1048576 | 0.544 | 12.542 | 12.99 | 103.57% | 13.45 | 107.24% | 13.549 | 108.03% |
| I16 | FALSE | 2^24 = 16777216 | 0.544 | 56.02 | 56.201 | 100.32% | 74.915 | 133.73% | 75.007 | 133.89% |
| I16 | FALSE | 2^28 = 268435456 | 0.544 | 704.52 | 699.244 | 99.25% | 1071 | 152.02% | 1071 | 152.02% |
| I16 | FALSE | 2^16 = 65536 | 0 | 9.199 | 9.548 | 103.79% | 9.205 | 100.07% | 9.386 | |