A piece of code ran fine on PyTorch 1.2, but after porting it to PyTorch 1.8 it raised the following error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3136, 10]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
After inspecting the code, I found no explicit in-place operations anywhere.
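When the in-place operation is not obvious, PyTorch's anomaly detection can help locate it: it makes the failing backward() print the forward-pass traceback of the operation that produced the modified tensor. A minimal usage sketch (the toy tensors here are illustrative, not the original code):

```python
import torch

# Wrap the forward + backward pass in anomaly detection; on a version-counter
# error, PyTorch prints where the offending tensor was created in the forward.
with torch.autograd.set_detect_anomaly(True):
    x = torch.randn(3, requires_grad=True)
    y = x * 2          # gradient of y.sum() w.r.t. x is 2 everywhere
    loss = y.sum()
    loss.backward()
```

Note that anomaly detection slows training down considerably, so enable it only while debugging.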
It turned out to be a behavior change across PyTorch versions. On PyTorch 1.4 and earlier, the following code runs without error:
opt1.zero_grad()
loss1.backward()
opt1.step()
opt2.zero_grad()
loss2.backward()
opt2.step()
But from PyTorch 1.5 onward this pattern raises the error above: optimizer.step() updates the parameters in place and bumps their version counters, so if the second backward() still needs those parameters, autograd detects the version mismatch. The fix is to run all backward() calls before any step():
opt1.zero_grad()
loss1.backward()
opt2.zero_grad()
loss2.backward()
opt1.step()
opt2.step()
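A self-contained sketch of the corrected ordering (the two-network setup, the names net1/net2, and the losses are illustrative assumptions, not the original code). Because loss2's graph passes through net1, opt1.step() must wait until both backward() calls have finished; retain_graph=True is needed here only because the two losses share part of the graph:

```python
import torch

torch.manual_seed(0)
net1 = torch.nn.Linear(4, 4)
net2 = torch.nn.Linear(4, 1)
opt1 = torch.optim.SGD(net1.parameters(), lr=0.1)
opt2 = torch.optim.SGD(net2.parameters(), lr=0.1)

x = torch.randn(8, 4)
h = net1(x)                 # loss2 depends on net1's parameters through h
loss1 = h.pow(2).mean()
loss2 = net2(h).mean()

opt1.zero_grad()
loss1.backward(retain_graph=True)  # keep the shared graph for loss2.backward()
opt2.zero_grad()
loss2.backward()
opt1.step()   # in-place parameter updates happen only after all backwards
opt2.step()
```

Running opt1.step() before loss2.backward() in this setup reproduces the version-counter error on PyTorch 1.5+.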
This bug cost me a whole afternoon, so I am recording it here.