Understanding BatchNorm2d

2023-10-04 09:43:33

Introduction

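There are many explanations of BN online. Its predecessor is whitening, but whitening has to remove the correlation between features (using PCA dimensionality reduction, matrix decomposition, and so on), which is time-consuming, and then normalizes the samples of each feature dimension, mapping them to a space with mean 0 and standard deviation 1. BN borrows the idea of whitening, but its distinguishing trait is that it normalizes per batch: for input data of shape [N,C,H,W], it normalizes each feature dimension (channel) over its N x H x W samples, mapping that feature's sample space to a standard one. To verify my understanding, I ran the following experiment.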
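The per-channel normalization described above can be sketched by hand. This is only a minimal sketch, assuming BatchNorm2d in training mode with its default affine parameters (weight 1, bias 0) and default eps of 1e-5; the helper name manual_batchnorm2d is hypothetical and not part of PyTorch.

```python
import torch

def manual_batchnorm2d(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Hypothetical helper: normalize each channel of an [N, C, H, W] tensor
    over its N*H*W samples."""
    # Reduce over batch and spatial dims, leaving one statistic per channel.
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    # Use the biased variance (divide by n, not n-1), as BN does when normalizing.
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    return (x - mean) / torch.sqrt(var + eps)

torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4)
bn = torch.nn.BatchNorm2d(3)  # a fresh module is in training mode, weight=1, bias=0
expected = bn(x)
manual = manual_batchnorm2d(x)
matches = torch.allclose(manual, expected, atol=1e-5)
print(matches)
```

If the understanding is right, the manual result should agree with the module's output up to floating-point tolerance.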

Preparing the data

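input1: a tensor with one feature dimension (channel), batch=2
input2: a tensor with one feature dimension (channel), batch=2
input: a tensor with two feature dimensions (channels), batch=2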

import numpy as np
import torch

# One BN layer for single-channel input, one for two-channel input
m1 = torch.nn.BatchNorm2d(1)
m2 = torch.nn.BatchNorm2d(2)

# Two random single-channel tensors of shape [N=2, C=1, H=2, W=2]
input1 = torch.randint(5, [2, 1, 2, 2], dtype=torch.float32)
input2 = torch.randint(5, [2, 1, 2, 2], dtype=torch.float32)
# Concatenate along the channel dimension to get shape [2, 2, 2, 2]
input = torch.cat((input1, input2), 1)

print('input size: -----------------')
print(input.size())
print('input1: -----------------')
print(input1)
print('input2: -----------------')
print(input2)
print('input: -----------------')
print(input)
'''
input size: -----------------
torch.Size([2, 2, 2, 2])
input1: -----------------
tensor([[[[3., 3.],[1., 1.]]],[[[2., 2.],[4., 1.]]]])
input2: -----------------
tensor([[[[2., 1.],[2., 2.]]],[[[1., 1.],[4., 1.]]]])
input: -----------------
tensor([[[[3., 3.],[1., 1.]],[[2., 1.],[2., 2.]]],[[[2., 2.],[4., 1.]],[[1., 1.],[4., 1.]]]])
'''

Verification

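The experiment shows that if BN applied to input1 gives out1 and BN applied to input2 gives out2, then concatenating out1 and out2 along the feature (channel) dimension gives exactly the result of applying BN to input.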

out1 = m1(input1)
print('mean: ' + str(torch.mean(out1)))
print('std: ' + str(torch.std(out1)))
# m1 is reused for input2: in training mode BN normalizes with the statistics
# of the current batch, so the earlier call does not affect this result
out2 = m1(input2)
out = m2(input)
print(out1)
print('#'*80 + 'manual calculation')
# torch.std() and np.std() give different results by default:
# torch.std() divides by (n-1) when averaging, while np.std() divides by n
out1_cal = (input1 - torch.mean(input1)) / torch.tensor(input1.numpy().std())
print(out1_cal)
print('#'*100)
print(out2)
print('#'*80 + 'manual calculation')
out2_cal = (input2 - torch.mean(input2)) / torch.tensor(input2.numpy().std())
print(out2_cal)
print('#'*100)
print(out)
'''
mean: tensor(1.4901e-08, grad_fn=)
std: tensor(1.0690, grad_fn=)
tensor([[[[ 0.8307,  0.8307],[-1.0681, -1.0681]]],[[[-0.1187, -0.1187],[ 1.7802, -1.0681]]]], grad_fn=)
################################################################################manual calculation
tensor([[[[ 0.8307,  0.8307],[-1.0681, -1.0681]]],[[[-0.1187, -0.1187],[ 1.7802, -1.0681]]]])
####################################################################################################
tensor([[[[ 0.2582, -0.7746],[ 0.2582,  0.2582]]],[[[-0.7746, -0.7746],[ 2.3238, -0.7746]]]], grad_fn=)
################################################################################manual calculation
tensor([[[[ 0.2582, -0.7746],[ 0.2582,  0.2582]]],[[[-0.7746, -0.7746],[ 2.3238, -0.7746]]]])
####################################################################################################
tensor([[[[ 0.8307,  0.8307],[-1.0681, -1.0681]],[[ 0.2582, -0.7746],[ 0.2582,  0.2582]]],[[[-0.1187, -0.1187],[ 1.7802, -1.0681]],[[-0.7746, -0.7746],[ 2.3238, -0.7746]]]], grad_fn=)
'''
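The comparison above is done by reading the printed tensors. As a rough sketch, the same check can be made programmatically with torch.allclose, using the input1 and input2 values printed earlier. It also shows why the printed standard deviation of out1 is 1.0690 rather than 1: torch.std() divides by n-1 by default, while BN normalizes with the divide-by-n form. The variable names stacked, same and ratio are introduced here only for illustration.

```python
import torch

# Values copied from the printed input1 / input2 above.
input1 = torch.tensor([[[[3., 3.], [1., 1.]]], [[[2., 2.], [4., 1.]]]])
input2 = torch.tensor([[[[2., 1.], [2., 2.]]], [[[1., 1.], [4., 1.]]]])
stacked = torch.cat((input1, input2), dim=1)  # shape [2, 2, 2, 2]

m1 = torch.nn.BatchNorm2d(1)
m2 = torch.nn.BatchNorm2d(2)

with torch.no_grad():  # gradients are not needed for this check
    out1 = m1(input1)
    out2 = m1(input2)
    out = m2(stacked)

# Per-channel BN results, concatenated, versus BN on the two-channel input.
same = torch.allclose(torch.cat((out1, out2), dim=1), out, atol=1e-5)
print('concatenated per-channel results match:', same)

# Sample std (n-1) versus population std (n) of the normalized output.
n = out1.numel()
ratio = (n / (n - 1)) ** 0.5  # expected gap between the two definitions
print('std with n-1:', out1.std().item())
print('std with n:  ', out1.std(unbiased=False).item())
print('sqrt(n/(n-1)):', ratio)
```

The divide-by-n standard deviation comes out very slightly below 1 because BN adds a small eps to the variance before taking the square root.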
