Distributed training warning: Grad strides do not match bucket view strides
The warning message is as follows:
Warning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
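The Grad strides mismatch is caused by tensor transformations that change the memory layout.
Example 1: if a transpose or permute is followed by a reshape, call .contiguous() on the result of the transpose or permute before reshaping: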
x.transpose(1, 2).cont
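To make the stride change concrete, here is a minimal sketch of how transpose alters strides and how .contiguous() restores a dense layout before a reshape. The tensor shape (4, 8, 16) and the variable names are assumptions chosen for illustration. The snippet does not reproduce the DDP warning itself; it only shows the layout difference that the fix targets.

```python
import torch

# Hypothetical input shaped (batch, seq_len, hidden); the sizes are illustrative only.
x = torch.randn(4, 8, 16)

# transpose swaps stride metadata without moving data, so the result is non-contiguous.
y = x.transpose(1, 2)
print(y.stride(), y.is_contiguous())   # (128, 1, 16) False

# .contiguous() copies the data into a dense row-major layout.
z = y.contiguous()
print(z.stride(), z.is_contiguous())   # (128, 8, 1) True

# The reshape now operates on a contiguous tensor.
out = z.reshape(4, -1)
print(out.shape)                       # torch.Size([4, 128])
```

Checking `tensor.is_contiguous()` right after each transpose or permute in the forward pass is a quick way to find the places where .contiguous() may be needed.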
