OFA
OFA Examples

Another day of pointless tinkering…

1. Setup on Linux

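Just follow the official setup instructions. There is nothing unusual here, but make sure every required library is actually installed.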

However, for various reasons my Linux install could not use the NVIDIA GPU, so I switched to Windows.

2. Setup on Windows

(1) torch.cuda.is_available() returns False

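Install the NVIDIA CUDA Toolkit, then install the specific torch build that matches that CUDA version. A CPU-only torch build will keep returning False no matter what else is installed.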
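To confirm which of the two pieces is missing, a quick diagnostic along these lines may help. This is a minimal sketch, not part of OFA: the helper name cuda_report is made up for illustration. It relies on torch.version.cuda being None for builds compiled without CUDA support.

```python
def cuda_report():
    """Collect the facts needed to diagnose torch.cuda.is_available() == False.

    cuda_report is a hypothetical helper name, not part of torch or OFA.
    """
    try:
        import torch
    except ImportError:
        # torch itself is missing, so nothing else can be checked.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means this torch build was compiled without CUDA support.
        "built_with_cuda": torch.version.cuda,
        # False with a CUDA build usually points at the driver or GPU setup.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda prints None, the torch package is the problem and needs to be reinstalled as a CUDA build. If it prints a version but cuda_available is still False, the driver or toolkit side is the more likely culprit.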

(2) A runtime error that tells you to use freeze_support()

RuntimeError: 
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.

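This happens because multiprocessing is implemented differently on Windows and Linux: Windows has no fork, so each child process is spawned fresh and re-imports the main module. The following posts cover it: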

Recipe-Multiprocessing

The “freeze_support()“ line can be omitted if the program

Error when running a subprocess in Python: The “freeze_support()“ line can be omitted if the program is not going to be froze

PyTorch num_workers, a tip for speedy training
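The idiom that the error message asks for can be sketched as follows. This is a generic illustration using only the standard library, not OFA code; square and main are made-up names. The point is that anything that starts child processes lives behind the __main__ guard, so a spawned child that re-imports the module does not start children of its own.

```python
import multiprocessing as mp


def square(x):
    # Must be defined at module level so child processes can import it.
    return x * x


def main():
    # Uses the platform default start method: spawn on Windows.
    with mp.Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    # Only needed when the script is frozen into a Windows executable;
    # it does nothing otherwise.
    mp.freeze_support()
    print(main())
```

A training script that builds a DataLoader with worker processes needs the same guard around its entry point.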

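In the end, my fix was to follow the traceback into the OFA library source and change every place that sets num_workers to 0, after which the error went away. This has downsides, though: data loading then runs in the main process. See the links above for details.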
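Instead of hard-coding 0 everywhere, the same workaround can be limited to Windows. The sketch below is an assumption about how one might do that, not what the OFA source does; pick_num_workers and build_loader are hypothetical names, and the toy dataset only exists to make the example self-contained.

```python
import sys

try:
    import torch
    from torch.utils.data import DataLoader, TensorDataset
except ImportError:  # keeps the helper below usable without torch
    torch = None


def pick_num_workers(requested, platform=sys.platform):
    """Return 0 on Windows (no worker processes), else the requested count."""
    if platform.startswith("win"):
        return 0
    return max(0, requested)


def build_loader(requested_workers=4):
    """Build a toy DataLoader whose worker count depends on the platform."""
    if torch is None:
        raise RuntimeError("torch is not installed")
    dataset = TensorDataset(torch.arange(8, dtype=torch.float32).unsqueeze(1))
    return DataLoader(
        dataset,
        batch_size=4,
        num_workers=pick_num_workers(requested_workers),
    )


if __name__ == "__main__":
    if torch is not None:
        print(build_loader().num_workers)
```

With num_workers=0 the batches are prepared in the main process, which avoids the multiprocessing error but can slow training when data loading is the bottleneck.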