[EPIC] Consolidate kernels between Thrust and CUB #26
Labels: cub, feature request, thrust
Below is a list of tasks in prioritized order. We should start with algorithms that already exist in CUB. This will allow delivering CUB optimizations into Thrust sooner.
- Document procedure
- Replace Thrust Algorithms with CUB
  - thrust::unique_by_key to use cub::DeviceSelect::UniqueByKey #1210
  - thrust::copy_if to use cub::DeviceSelect #1263
  - thrust::partition to use cub::DevicePartition #1383
  - thrust::partition_copy and thrust::stable_partition_copy to use DevicePartition #1397
  - thrust::unique to use cub::DeviceSelect::Unique #1625
  - thrust::reduce to use cub::DeviceReduce #1626
  - thrust::reduce_by_key to use cub::DeviceReduce::ReduceByKey
  - thrust/extrema.h to use cub::DeviceReduce
  - Device* interface (i.e., to make sure we use the index type that is optimized for the CUB algorithm), or alternatively go via the Dispatch* interface, but make sure to use the right offset type
- Port Thrust Algorithms into CUB
  - thrust::cuda_cub::parallel_for to CUB #1231
  - thrust::merge to CUB #1763
  - thrust/thrust/system/cuda/detail/set_operations.h to CUB

A few notes:
- thrust::partition_copy and thrust::stable_partition_copy require two distinct output iterators: one for the selected items and one for the rejected items. DevicePartition, however, currently supports only a single output iterator, where the selected items are written to the beginning in order and the rejected items are written to the end in reverse order. Supporting these two Thrust algorithms requires extending AgentSelectIf, implementing overloads for methods like ScatterTwoPhase that are concerned with writing rejected items to the output iterators.
- ::Flagged version along with a transform iterator.
- thrust::reduce_by_key: we need to decide on the accumulator type.
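
To make the partition note above concrete, here is a minimal host-side sketch in standard C++. It is not CUB or Thrust code. It only models the single-output layout described for DevicePartition (selected items at the front in input order, rejected items at the back in reverse input order) and shows how that layout maps onto the two ordered outputs that thrust::partition_copy expects. The function names partition_single_output and split_single_output are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model of the single-output layout:
// selected items fill the output from the front in input order,
// rejected items fill it from the back, so they end up reversed.
// Returns the number of selected items.
template <typename T, typename Pred>
std::size_t partition_single_output(const std::vector<T>& in,
                                    std::vector<T>& out,
                                    Pred pred)
{
  out.resize(in.size());
  std::size_t front = 0;         // next free slot for a selected item
  std::size_t back  = in.size(); // one past the next free slot for a rejected item
  for (const T& x : in)
  {
    if (pred(x))
    {
      out[front++] = x;
    }
    else
    {
      out[--back] = x;
    }
  }
  return front;
}

// Hypothetical helper: turn the packed single-output layout into the two
// separate, input-ordered sequences that a partition_copy-style API needs.
// The rejected tail is read back to front to undo the reversal.
template <typename T>
void split_single_output(const std::vector<T>& packed,
                         std::size_t num_selected,
                         std::vector<T>& selected,
                         std::vector<T>& rejected)
{
  assert(num_selected <= packed.size());
  const auto n        = static_cast<std::ptrdiff_t>(num_selected);
  const auto num_rej  = static_cast<std::ptrdiff_t>(packed.size()) - n;
  selected.assign(packed.begin(), packed.begin() + n);
  rejected.assign(packed.rbegin(), packed.rbegin() + num_rej);
}
```

Calling partition_single_output on the input 1, 2, 3, 4, 5, 6 with an "is even" predicate yields the packed sequence 2, 4, 6, 5, 3, 1 and a selected count of 3. The extra pass in split_single_output is what a dedicated two-output scatter would avoid, which is the motivation for extending the agent rather than post-processing the single output.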