Gaussian Discriminant Analysis
Gaussian discriminant analysis (GDA) is a probabilistic generative model. Rather than modeling p(y|x) directly, it uses Bayes' rule to compare p(y=1|x) with p(y=0|x) and assigns x to the class with the larger posterior.
Bayes' rule:
$$p(y|x)=\frac{p(x|y)\,p(y)}{p(x)}$$
The denominator p(x) does not depend on y, so it can be dropped when maximizing over y. The decision rule then amounts to modeling the joint probability:
$$\arg\max_y\ p(y|x)=\arg\max_y\ p(x|y)\,p(y)=\arg\max_y\ p(x,y)$$
Here p(y) is the prior, p(y|x) is the posterior, and p(x|y) is the likelihood.
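The decision rule above can be tried on a one-dimensional toy problem. The following is a minimal sketch, not a reference implementation: the function names `gaussian_pdf` and `classify` and all parameter values are assumptions made for illustration, and the class-conditional densities are taken to be normal with a shared standard deviation.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def classify(x, prior1, mean1, mean0, std):
    """Return 1 if p(x|y=1)p(y=1) > p(x|y=0)p(y=0), otherwise 0.

    p(x) is never computed: it is the same for both classes,
    so it cannot change which joint probability is larger.
    """
    joint1 = gaussian_pdf(x, mean1, std) * prior1
    joint0 = gaussian_pdf(x, mean0, std) * (1.0 - prior1)
    return 1 if joint1 > joint0 else 0

# Hypothetical parameters: class 1 centred at 2, class 0 centred at -2.
points = (-3.0, -0.5, 0.5, 3.0)
labels = [classify(x, prior1=0.5, mean1=2.0, mean0=-2.0, std=1.0) for x in points]
```

With equal priors and equal spread, the boundary sits midway between the two means, so points left of 0 go to class 0 and points right of 0 go to class 1.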
Assume:
$$y \sim B(1,\rho)$$
$$x\,|\,y=1 \sim N(\mu_1,\sigma)$$
$$x\,|\,y=0 \sim N(\mu_2,\sigma)$$
Both classes share the same covariance σ. With y ∈ {0, 1} and ρ = p(y=1), the two cases can be written compactly as:
$$p(y)=\rho^{y}(1-\rho)^{1-y}$$
$$p(x|y)=N(\mu_1,\sigma)^{y}\,N(\mu_2,\sigma)^{1-y}$$
The log-likelihood over the N training samples (x_i, y_i) is
$$\begin{aligned}
L(\theta) &= \log\prod_{i=1}^{N} p(x_i|y_i)\,p(y_i) \\
&= \sum_{i=1}^{N} \log\big(p(x_i|y_i)\,p(y_i)\big) \\
&= \sum_{i=1}^{N} \log\big(N(\mu_1,\sigma)^{y_i}N(\mu_2,\sigma)^{1-y_i}\rho^{y_i}(1-\rho)^{1-y_i}\big) \\
&= \sum_{i=1}^{N} \Big[\log\big(N(\mu_1,\sigma)^{y_i}N(\mu_2,\sigma)^{1-y_i}\big)+\log\big(\rho^{y_i}(1-\rho)^{1-y_i}\big)\Big] \\
&= \sum_{i=1}^{N} \Big[\log N(\mu_1,\sigma)^{y_i}+\log N(\mu_2,\sigma)^{1-y_i}+\log\big(\rho^{y_i}(1-\rho)^{1-y_i}\big)\Big]
\end{aligned}$$
The parameters are $\theta=(\mu_1,\mu_2,\sigma,\rho)$.
The estimate is $\hat\theta = \arg\max_\theta L(\theta)$.
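To make the three-term decomposition of the log-likelihood concrete, here is a small sketch that evaluates it for one-dimensional data. The names `log_normal_pdf` and `log_likelihood` and the sample values are illustrative assumptions, and `sigma` is treated as a standard deviation.

```python
import math

def log_normal_pdf(x, mean, std):
    """Log-density of a 1-D normal distribution N(mean, std^2) at x."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)

def log_likelihood(xs, ys, mu1, mu2, sigma, rho):
    """Sum the three per-sample terms of L(theta)."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += y * log_normal_pdf(x, mu1, sigma)          # log N(mu1, sigma)^y
        total += (1 - y) * log_normal_pdf(x, mu2, sigma)    # log N(mu2, sigma)^(1-y)
        total += y * math.log(rho) + (1 - y) * math.log(1.0 - rho)  # prior term
    return total

# Hypothetical data: three samples of class 1 near 2, two of class 0 near -2.
xs = [1.8, 2.3, 2.1, -1.9, -2.2]
ys = [1, 1, 1, 0, 0]
good = log_likelihood(xs, ys, mu1=2.0, mu2=-2.0, sigma=0.5, rho=0.6)
bad = log_likelihood(xs, ys, mu1=-2.0, mu2=2.0, sigma=0.5, rho=0.6)
```

Parameters that match the data (`good`) score a higher log-likelihood than the same parameters with the two means swapped (`bad`), which is the quantity the maximization compares.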
1: Solve for $\rho$
Only the last term of $L(\theta)$ depends on $\rho$:
$$\begin{aligned}
\frac{\partial L(\theta)}{\partial \rho} &= \frac{\partial}{\partial \rho}\sum_{i=1}^{N}\big(\log \rho^{y_i}+\log(1-\rho)^{1-y_i}\big) \\
&= \sum_{i=1}^{N}\Big(\frac{y_i}{\rho}-\frac{1-y_i}{1-\rho}\Big) = 0
\end{aligned}$$
Multiplying through by $\rho(1-\rho)$:
$$\sum_{i=1}^{N}\big(y_i(1-\rho)-(1-y_i)\rho\big) = \sum_{i=1}^{N}\big(y_i-y_i\rho-\rho+y_i\rho\big) = \sum_{i=1}^{N}(y_i-\rho) = 0$$
So
$$\sum_{i=1}^{N} y_i=\sum_{i=1}^{N} \rho = N\rho$$
because the sum has N terms.
The number of samples with y=1 is
$$\sum_{i=1}^{N} y_i=N_1$$
and the number of samples with y=0 is
$$\sum_{i=1}^{N} (1-y_i)=N_2$$
with $N_1+N_2=N$.
Therefore
$$N_1=N\rho$$
$$\rho = \frac{N_1}{N}$$
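The closed form for ρ can be checked numerically. The sketch below, with made-up labels and a hypothetical helper `bernoulli_log_likelihood`, compares N_1/N against a brute-force search over a grid of candidate values.

```python
import math

def bernoulli_log_likelihood(ys, rho):
    """The part of L(theta) that depends on rho."""
    return sum(y * math.log(rho) + (1 - y) * math.log(1.0 - rho) for y in ys)

# Hypothetical labels: four samples of class 1 out of ten.
ys = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]

# Closed-form estimate: fraction of samples with y = 1.
rho_hat = sum(ys) / len(ys)

# Brute-force check over rho = 0.01, 0.02, ..., 0.99.
grid = [k / 100 for k in range(1, 100)]
best_on_grid = max(grid, key=lambda r: bernoulli_log_likelihood(ys, r))
```

Both `rho_hat` and `best_on_grid` come out as 0.4 for these labels, in line with the derivation.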
2: Solve for $\mu_1$
Only the class-1 Gaussian term of $L(\theta)$ depends on $\mu_1$:
$$\frac{\partial L(\theta)}{\partial \mu_1}=\frac{\partial}{\partial \mu_1}\sum_{i=1}^{N} \log N(\mu_1,\sigma)^{y_i}$$
Define:
$$\Sigma = \left[ \begin{matrix} \sigma_{1}^2&0&\cdots&0\\ 0&\sigma_{2}^2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\sigma_{n}^2 \end{matrix}\right]$$
$\Sigma$ is the covariance matrix (the σ in $N(\mu,\sigma)$ above); the entry in row i and column j is the covariance between features i and j.
The features are assumed here to be mutually independent, so only the diagonal entries (i = j) are nonzero, and the covariance of a feature with itself is its variance.
Since $\Sigma$ is diagonal, its inverse is:
$$\Sigma^{-1} = \left[ \begin{matrix} \frac{1}{\sigma_{1}^2}&0&\cdots&0\\ 0&\frac{1}{\sigma_{2}^2}&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\frac{1}{\sigma_{n}^2} \end{matrix}\right]$$
The determinant of a diagonal matrix is the product of its diagonal entries, so
$$\sigma_{z}= \left|\Sigma\right|^{\frac{1}{2}} =\sigma_{1}\sigma_{2}\cdots\sigma_{n}$$
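The two facts about diagonal covariance matrices used here (entrywise inverse, determinant as a product) are easy to confirm with NumPy. The variances below are arbitrary values chosen for illustration.

```python
import numpy as np

# Hypothetical per-feature variances sigma_1^2, sigma_2^2, sigma_3^2.
variances = np.array([4.0, 1.0, 0.25])
cov = np.diag(variances)

# Inverse of a diagonal matrix: invert each diagonal entry.
inv_direct = np.diag(1.0 / variances)
inv_general = np.linalg.inv(cov)

# |Sigma|^(1/2) equals the product of the standard deviations.
sqrt_det = np.sqrt(np.linalg.det(cov))
prod_std = np.prod(np.sqrt(variances))
```

Here `inv_direct` matches `inv_general`, and `sqrt_det` matches `prod_std` (both are 2 × 1 × 0.5 = 1).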
Expanding the Gaussian density gives
$$\begin{aligned}
\frac{\partial L(\theta)}{\partial \mu_1} &= \frac{\partial}{\partial \mu_1}\sum_{i=1}^{N} y_i\log \Big(\frac{1}{(\sqrt{2\pi})^n|\Sigma|^{\frac{1}{2}}}\exp\big(-\tfrac{1}{2}(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)\big)\Big) \\
&= \frac{\partial}{\partial \mu_1}\sum_{i=1}^{N} \Big[y_i\log \frac{1}{(\sqrt{2\pi})^n|\Sigma|^{\frac{1}{2}}}-\frac{1}{2}y_i(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)\Big]
\end{aligned}$$
Here the outer $\sum$ is the summation over samples, while $\Sigma$ is the covariance matrix.
The first term does not depend on $\mu_1$, so the condition becomes
$$-\frac{1}{2}\frac{\partial}{\partial \mu_1}\sum_{i=1}^{N} y_i(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1) =0$$
$$-\frac{1}{2}\frac{\partial}{\partial \mu_1}\sum_{i=1}^{N} y_i(x_i^T\Sigma^{-1}-\mu_1^T\Sigma^{-1})(x_i-\mu_1) =0$$
$$-\frac{1}{2}\frac{\partial}{\partial \mu_1}\sum_{i=1}^{N} y_i(x_i^T\Sigma^{-1}x_i-x_i^T\Sigma^{-1}\mu_1-\mu_1^T\Sigma^{-1}x_i+\mu_1^T\Sigma^{-1}\mu_1)=0$$
Differentiating, and using the symmetry of $\Sigma^{-1}$:
$$-\frac{1}{2}\sum_{i=1}^{N} y_i(-2\Sigma^{-1}x_i+2\Sigma^{-1}\mu_1)=0$$
$$\sum_{i=1}^{N} y_i(\Sigma^{-1}x_i-\Sigma^{-1}\mu_1)=0$$
Left-multiplying by $\Sigma$:
$$\sum_{i=1}^{N} y_i(x_i-\mu_1)=0$$
$$\sum_{i=1}^{N} y_i x_i=\mu_1\sum_{i=1}^{N} y_i$$
$$\mu_1 =\frac{\sum_{i=1}^{N} y_i x_i}{\sum_{i=1}^{N} y_i} =\frac{\sum_{i=1}^{N} y_i x_i}{N_1}$$
The same argument applied to the $(1-y_i)$ terms gives $\mu_2 =\frac{\sum_{i=1}^{N} (1-y_i) x_i}{N_2}$.
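The estimate of the class mean is a label-weighted average of the samples. A short NumPy sketch with made-up data shows the formula next to the equivalent "average the rows of each class" view; `X`, `y`, and the specific numbers are assumptions for illustration.

```python
import numpy as np

# Hypothetical data: one sample per row, labels in {0, 1}.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 0.0],
              [-1.0, -1.0],
              [-3.0, 1.0]])
y = np.array([1, 1, 1, 0, 0])

n1 = y.sum()
n2 = len(y) - n1

# mu_1 = sum_i y_i x_i / N_1: the labels act as 0/1 weights on the rows.
mu1 = (y[:, None] * X).sum(axis=0) / n1
# By symmetry, mu_2 uses the weights (1 - y_i).
mu2 = ((1 - y)[:, None] * X).sum(axis=0) / n2
```

For this data `mu1` is `[3, 2]` and `mu2` is `[-2, 0]`, the same values as `X[y == 1].mean(axis=0)` and `X[y == 0].mean(axis=0)`.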
3: Solve for $\Sigma$
Trace identities used below:
$$tr(A)=\sum_i A_{ii}$$
$$tr(AB)=tr(BA)$$
$$tr(ABC)=tr(CAB)=tr(BCA)$$
$$\frac{\partial\, tr(AB)}{\partial A}=B^T$$
|A| denotes the determinant of the matrix A, and
$$\frac{\partial |A|}{\partial A}=|A|\,(A^{-1})^T$$
which equals $|A|\,A^{-1}$ when A is symmetric.
If a is a real number, then tr(a)=a.
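These matrix identities can be sanity-checked numerically before relying on them. The sketch below uses random matrices and a hypothetical helper `numerical_gradient` (central finite differences); the determinant rule is checked on a symmetric positive-definite matrix, which is the case that matters for a covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.normal(size=(3, 3)) for _ in range(3))

# Cyclic property of the trace.
swap_ok = np.isclose(np.trace(A @ B), np.trace(B @ A))
cyclic_ok = np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))

def numerical_gradient(f, M, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at matrix M."""
    grad = np.zeros_like(M)
    for i in range(M.shape[0]):
        for j in range(M.shape[1]):
            step = np.zeros_like(M)
            step[i, j] = eps
            grad[i, j] = (f(M + step) - f(M - step)) / (2 * eps)
    return grad

# d tr(AB) / dA should equal B^T.
grad_trace = numerical_gradient(lambda M: np.trace(M @ B), A)

# d|S| / dS on a symmetric positive-definite S should equal |S| S^{-1}.
S = A @ A.T + np.eye(3)
grad_det = numerical_gradient(np.linalg.det, S)
expected_det_grad = np.linalg.det(S) * np.linalg.inv(S)
```

Comparing `grad_trace` with `B.T` and `grad_det` with `expected_det_grad` via `np.allclose` should report agreement up to finite-difference error.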
Let:
$$C_1=\{x_i \mid y_i=1,\ i = 1,\dots,N\}$$
$$C_2=\{x_i \mid y_i=0,\ i = 1,\dots,N\}$$
$$|C_1|=N_1$$
$$|C_2|=N_2$$
$$N_1+N_2=N$$
Setting the partial derivative of the log-likelihood with respect to $\Sigma$ to zero (the prior term does not depend on $\Sigma$):
$$\frac{\partial L(\theta)}{\partial \Sigma}=\frac{\partial}{\partial \Sigma}\Big(\sum_{x_i \in C_1}\log N(\mu_1,\Sigma) +\sum_{x_i \in C_2}\log N(\mu_2,\Sigma)\Big) =0$$
Let:
$$\begin{aligned}
f(\mu_1) &=\sum_{x_i \in C_1}\log N(\mu_1,\Sigma) \\
&= \sum_{x_i \in C_1}\log \Big( \frac{1}{(\sqrt{2\pi})^n|\Sigma|^{\frac{1}{2}}}\exp\big(-\tfrac{1}{2}(x_i-\mu_1)^T \Sigma^{-1}(x_i-\mu_1)\big)\Big) \\
&= \sum_{x_i \in C_1}\Big[\log \frac{1}{(\sqrt{2\pi})^n|\Sigma|^{\frac{1}{2}}}-\frac{1}{2}(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)\Big] \\
&= \sum_{x_i \in C_1}\Big[\log \frac{1}{(\sqrt{2\pi})^n}-\frac{1}{2}\log |\Sigma|-\frac{1}{2}(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)\Big]
\end{aligned}$$
Distributing the summation over the three terms:
$$f(\mu_1) =\sum_{x_i \in C_1}\log \frac{1}{(\sqrt{2\pi})^n}-\frac{1}{2} \sum_{x_i \in C_1}\log |\Sigma|-\frac{1}{2}\sum_{x_i \in C_1}(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)$$
The first term, $\sum_{x_i \in C_1}\log \frac{1}{(\sqrt{2\pi})^n}$, does not depend on $\Sigma$; denote it by the constant $C_3$.
The second term is
$$-\frac{1}{2}\sum_{x_i \in C_1}\log |\Sigma|=-\frac{1}{2}N_1\log |\Sigma|$$
Since $(x_i-\mu_1)^T$ is 1×n,
$\Sigma^{-1}$ is n×n,
and $(x_i-\mu_1)$ is n×1,
the quadratic form $(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)$ is a real number.
It therefore equals its own trace, and the cyclic property of the trace gives
$$\begin{aligned}
(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1) &= tr\big((x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)\big) \\
&= tr\big((x_i-\mu_1)(x_i-\mu_1)^T\Sigma^{-1}\big)
\end{aligned}$$
Summing over the class and moving the sum inside the trace:
$$\begin{aligned}
\sum_{x_i \in C_1}(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1) &= \sum_{x_i \in C_1}tr\big((x_i-\mu_1)(x_i-\mu_1)^T\Sigma^{-1}\big) \\
&= tr\Big( \sum_{x_i \in C_1}(x_i-\mu_1)(x_i-\mu_1)^T\,\Sigma^{-1}\Big)
\end{aligned}$$
Define the sample covariance matrix of class 1:
$$S_1=\frac{1}{N_1}\sum_{x_i \in C_1}(x_i-\mu_1)(x_i-\mu_1)^T$$
Then
$$\sum_{x_i \in C_1}(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)=N_1\,tr(S_1\Sigma^{-1})$$
$$f(\mu_1) =C_3-\frac{1}{2}\big(N_1\log |\Sigma|+N_1\,tr(S_1\Sigma^{-1})\big)$$
Similarly, with $S_2$ the sample covariance matrix of class 2 and $C_4$ the corresponding constant:
$$f(\mu_2) =C_4-\frac{1}{2}\big(N_2\log |\Sigma|+N_2\,tr(S_2\Sigma^{-1})\big)$$
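The step that turns a sum of quadratic forms into a single trace can be verified numerically. In the sketch below the samples, the matrix `cov`, and all variable names are made up for illustration; samples are stored one per row, so `diffs.T @ diffs` is the sum of the outer products of the centred column vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
X1 = rng.normal(size=(6, 3))        # hypothetical class-1 samples, one per row
mu1 = X1.mean(axis=0)

M = rng.normal(size=(3, 3))
cov = M @ M.T + np.eye(3)           # an arbitrary symmetric positive-definite matrix
cov_inv = np.linalg.inv(cov)

diffs = X1 - mu1

# Left-hand side: sum over samples of (x - mu)^T Sigma^{-1} (x - mu).
quad_sum = sum(d @ cov_inv @ d for d in diffs)

# Right-hand side: N_1 * tr(S_1 Sigma^{-1}) with S_1 the class sample covariance.
S1 = diffs.T @ diffs / len(X1)
trace_form = len(X1) * np.trace(S1 @ cov_inv)
```

The two quantities `quad_sum` and `trace_form` agree up to floating-point rounding.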
The derivative condition can then be written as
$$\frac{\partial L(\theta)}{\partial \Sigma}=\frac{\partial}{\partial \Sigma}\big(f(\mu_1)+f(\mu_2)\big) =0$$
Dropping the constants, which do not depend on $\Sigma$:
$$\begin{aligned}
\frac{\partial L(\theta)}{\partial \Sigma} &= \frac{\partial}{\partial \Sigma}\Big(-\frac{1}{2}\big(N_1\log |\Sigma|+N_1\,tr(S_1\Sigma^{-1})\big)-\frac{1}{2}\big(N_2\log |\Sigma|+N_2\,tr(S_2\Sigma^{-1})\big)\Big) \\
&= \frac{\partial}{\partial \Sigma}\Big(-\frac{1}{2}\big(N\log |\Sigma|+N_1\,tr(S_1\Sigma^{-1})+N_2\,tr(S_2\Sigma^{-1})\big)\Big)
\end{aligned}$$
For symmetric $\Sigma$, $\frac{\partial \log|\Sigma|}{\partial \Sigma}=\frac{1}{|\Sigma|}|\Sigma|\Sigma^{-1}=\Sigma^{-1}$ and $\frac{\partial\, tr(S\Sigma^{-1})}{\partial \Sigma}=-\Sigma^{-1}S^T\Sigma^{-1}$, so
$$\frac{\partial L(\theta)}{\partial \Sigma}=-\frac{1}{2}\big(N\Sigma^{-1}-N_1\Sigma^{-1}S_1^T\Sigma^{-1}-N_2\Sigma^{-1}S_2^T\Sigma^{-1}\big)=0$$
Multiplying on the left and on the right by $\Sigma$ gives
$$N\Sigma =N_1S_1^T+N_2S_2^T$$
$$\Sigma =\frac{N_1S_1^T+N_2S_2^T}{N}$$
Because covariance matrices are symmetric, this can be written as
$$\Sigma =\frac{N_1S_1+N_2S_2}{N}$$
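Putting the estimates for ρ, μ_1, μ_2, and the pooled covariance together gives a complete classifier. The following is a minimal sketch under stated assumptions, not a reference implementation: `fit_gda` and `predict` are hypothetical names, the data is synthetic, and the covariance is the full pooled matrix (N_1 S_1 + N_2 S_2)/N rather than a diagonal one.

```python
import numpy as np

def fit_gda(X, y):
    """Maximum-likelihood estimates (rho, mu1, mu2, pooled covariance)."""
    n = len(y)
    n1 = int(y.sum())
    n2 = n - n1
    rho = n1 / n                                   # rho = N_1 / N
    mu1 = X[y == 1].mean(axis=0)                   # mean of class 1
    mu2 = X[y == 0].mean(axis=0)                   # mean of class 0
    d1 = X[y == 1] - mu1
    d2 = X[y == 0] - mu2
    S1 = d1.T @ d1 / n1                            # class-1 sample covariance
    S2 = d2.T @ d2 / n2                            # class-0 sample covariance
    cov = (n1 * S1 + n2 * S2) / n                  # pooled covariance
    return rho, mu1, mu2, cov

def predict(X, rho, mu1, mu2, cov):
    """Pick the class with the larger log p(x|y) + log p(y)."""
    cov_inv = np.linalg.inv(cov)

    def log_joint(mu, prior):
        d = X - mu
        # The Gaussian normalising constant is shared by both classes and cancels.
        quad = np.einsum("ij,jk,ik->i", d, cov_inv, d)
        return -0.5 * quad + np.log(prior)

    return (log_joint(mu1, rho) > log_joint(mu2, 1.0 - rho)).astype(int)

# Synthetic data: 200 points around (2, 2) labelled 1, 100 around (-2, -2) labelled 0.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=[2.0, 2.0], size=(200, 2)),
               rng.normal(loc=[-2.0, -2.0], size=(100, 2))])
y = np.concatenate([np.ones(200, dtype=int), np.zeros(100, dtype=int)])

rho, mu1, mu2, cov = fit_gda(X, y)
accuracy = (predict(X, rho, mu1, mu2, cov) == y).mean()
```

Because the two classes share one covariance matrix, the log-joint comparison is linear in x, so the resulting decision boundary is a hyperplane. On well-separated synthetic data like this, the training accuracy should be close to 1.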
