I am currently trying to replicate a DeblurGAN-v2 network and am working on the training step. Here is the current state of my training pipeline:
import random

import torch
from torch.autograd import Variable

torch.autograd.set_detect_anomaly(mode=True)
total_generator_loss = 0
total_discriminator_loss = 0
psnr_score = 0.0
used_loss_function = 'wgan_gp_loss'
for epoch in range(n_epochs):
    # set to train mode
    generator.train(); discriminator.train()
    tqdm_bar = tqdm(train_loader, desc=f'Training Epoch {epoch} ', total=int(len(train_loader)))
    for batch_idx, imgs in enumerate(tqdm_bar):
        # move imgs to the GPU
        blurred_images = imgs["blurred"].cuda()
        sharped_images = imgs["sharp"].cuda()
        # generator output
        deblurred_img = generator(blurred_images)
        # denormalize
        with torch.no_grad():
            denormalized_blurred = denormalize(blurred_images)
            denormalized_sharp = denormalize(sharped_images)
            denormalized_deblurred = denormalize(deblurred_img)
        # get D's output
        sharp_discriminator_out = discriminator(sharped_images)
        deblurred_discriminator_out = discriminator(deblurred_img)
        # set critic_updates
        if used_loss_function == 'wgan_gp_loss':
            critic_updates = 5
        else:
            critic_updates = 1
        # train discriminator
        discriminator_loss = 0
        for i in range(critic_updates):
            discriminator_optimizer.zero_grad()
            # train discriminator on real and fake
            if used_loss_function == 'wgan_gp_loss':
                gp_lambda = 10
                alpha = random.random()
                interpolates = alpha * sharped_images + (1 - alpha) * deblurred_img
                interpolates_discriminator_out = discriminator(interpolates)
                kwargs = {'gp_lambda': gp_lambda,
                          'interpolates': interpolates,
                          'interpolates_discriminator_out': interpolates_discriminator_out,
                          'sharp_discriminator_out': sharp_discriminator_out,
                          'deblurred_discriminator_out': deblurred_discriminator_out
                          }
                wgan_loss_d, gp_d = wgan_gp_loss('D', **kwargs)
                discriminator_loss_per_update = wgan_loss_d + gp_d
            discriminator_loss_per_update.backward(retain_graph=True)
            discriminator_optimizer.step()
            discriminator_loss += discriminator_loss_per_update.item()
But when I run this code, I get the following error message:
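RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 512, 4, 4]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!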
RuntimeError Traceback (most recent call last) in ()
     62 # discriminator_loss_per_update = gan_loss_d
     63 -

1 frames
/usr/local/lib/python3.7/dist-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    243 create_graph=create_graph,
    244 inputs=inputs)
--> 245 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    246
    247 def register_hook(self, hook):

/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    145 Variable._execution_engine.run_backward(
    146 tensors, grad_tensors, retain_graph, create_graph, inputs,
--> 147 allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
    148
    149
Unfortunately, I can't track down the in-place operation that causes this error. Does anyone have an idea or a suggestion? I would appreciate any input :)
Try changing the last line to:
discriminator_loss = discriminator_loss + discriminator_loss_per_update.item()
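If that change alone does not help (discriminator_loss is a plain Python number there, so `+=` on it is not an in-place tensor operation), another likely culprit is worth checking: sharp_discriminator_out and deblurred_discriminator_out are computed once, before the critic loop. discriminator_optimizer.step() updates the discriminator weights in place, so the second backward() through the retained graph finds weights at a newer version than the one saved during the forward pass. The reported [1, 512, 4, 4] tensor would be consistent with a discriminator conv weight. Below is a minimal sketch of a critic loop that reruns the discriminator inside every update and detaches the generator output. The tiny networks, the CPU tensors, the per-sample alpha, and the inline WGAN-GP loss are all stand-in assumptions, since the original generator, discriminator, and wgan_gp_loss are not shown.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny stand-ins for the real generator and discriminator.
generator = nn.Conv2d(3, 3, kernel_size=3, padding=1)
discriminator = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.LeakyReLU(0.2),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),
)
discriminator_optimizer = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

# Random CPU batches in place of imgs["blurred"] and imgs["sharp"].
blurred_images = torch.randn(2, 3, 8, 8)
sharped_images = torch.randn(2, 3, 8, 8)
critic_updates = 5
gp_lambda = 10

# Detach so the critic updates never backpropagate into the generator.
deblurred_img = generator(blurred_images).detach()

discriminator_loss = 0.0
for _ in range(critic_updates):
    discriminator_optimizer.zero_grad()

    # Fresh forward passes: each graph is built with the current weights.
    sharp_out = discriminator(sharped_images)
    deblurred_out = discriminator(deblurred_img)

    # Random points between real and generated images (assumed per-sample alpha).
    alpha = torch.rand(sharped_images.size(0), 1, 1, 1)
    interpolates = alpha * sharped_images + (1 - alpha) * deblurred_img
    interpolates.requires_grad_(True)
    interpolates_out = discriminator(interpolates)

    # Gradient of the critic output with respect to the interpolated inputs.
    grads = torch.autograd.grad(
        outputs=interpolates_out,
        inputs=interpolates,
        grad_outputs=torch.ones_like(interpolates_out),
        create_graph=True,
    )[0]
    grad_norm = grads.flatten(1).norm(2, dim=1)
    gradient_penalty = gp_lambda * ((grad_norm - 1) ** 2).mean()

    # Standard WGAN-GP critic loss, written inline as an assumption.
    loss = deblurred_out.mean() - sharp_out.mean() + gradient_penalty
    loss.backward()  # no retain_graph: the graph is rebuilt every iteration
    discriminator_optimizer.step()
    discriminator_loss += loss.item()
```

Because every iteration builds its own graph, retain_graph=True is no longer needed for the critic updates. The generator step would then use a separate, non-detached forward pass through the discriminator.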