In multirun mode, launch parallel jobs with a start interval between them
🚀 Feature Request
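In multirun mode, support launching parallel jobs with a configurable delay between the start of one job and the start of the next.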
Motivation
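When training a neural network, the model and the dataset have to be loaded onto a GPU.
We pick the GPU by ranking all GPUs by their free resources.
If the parallel jobs all start at the same moment, loading onto the GPU takes time, so the ranking functions that run simultaneously all see the same free resources and are very likely to pick the same GPU, which can cause an out-of-memory error.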
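
The behavior being asked for can be illustrated with a minimal sketch. This is not an existing launcher option: `launch_staggered` and `interval_s` are hypothetical names, and the jobs are assumed to be plain Python callables run in threads. Each job is submitted to the pool only after a fixed delay has passed since the previous submission, so an earlier job has time to claim GPU memory before the next job ranks the GPUs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def launch_staggered(jobs, interval_s, max_workers=None):
    """Run callables in parallel, waiting interval_s between consecutive starts.

    Hypothetical helper for illustration only; it is not part of any launcher API.
    Returns the job results in submission order.
    """
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for index, job in enumerate(jobs):
            if index > 0:
                # Give the previous job time to allocate its GPU memory
                # before the next job inspects the free resources.
                time.sleep(interval_s)
            futures.append(pool.submit(job))
        # Jobs still overlap in time; only their start times are spread out.
        return [future.result() for future in futures]
```

For the delay to help, `interval_s` would need to be longer than the time a job takes to load its model and data onto the GPU. A fixed delay only reduces the chance of a collision; it does not rule one out.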