I'm trying to train the model from scratch on a custom subset of ImageNet. Training works fine on a single GPU, but when running on multiple GPUs I get the following error:
Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 3 does not equal 0 (while checking arguments for cudnn_batch_norm)
My configuration file looks like this:
name: train_colorformer
model_type: LABGANRGBModel
scale: 1
num_gpu: 4
manual_seed: 0
queue_size: 64
I'm using CUDA_VISIBLE_DEVICES to specify the GPUs to be used.
I tried looking for any inputs that are not moved to cuda but without success.
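A minimal sketch of this kind of check, assuming the network is a standard `torch.nn.Module`. The helper name `tensors_by_device` is hypothetical and not part of the repository. It groups parameters and buffers by device, since the error mentions `cudnn_batch_norm` and batch-norm layers keep their running statistics as buffers rather than parameters:

```python
from collections import defaultdict


def tensors_by_device(module):
    """Group the parameter and buffer names of `module` by device.

    `module` is assumed to expose `named_parameters()` and `named_buffers()`,
    as any torch.nn.Module does. This helper is illustrative only.
    """
    groups = defaultdict(list)
    for name, param in module.named_parameters():
        groups[str(param.device)].append(name)
    # Buffers are listed separately from parameters; BatchNorm stores
    # running_mean / running_var there, so they are easy to overlook.
    for name, buf in module.named_buffers():
        groups[str(buf.device)].append(name)
    return dict(groups)
```

Calling it on each network before the first forward pass and printing the result should show whether everything sits on a single device. More than one key in the returned dictionary would point to the modules that were left behind.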