A Summary of Entropy Measures

2023-08-27 12:21:26

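Approximate Entropy

(Approximate Entropy, ApEn, AE)

Function: apen = ApEn( dim, r, data, tau )
Use: a nonlinear-dynamics measure that quantifies the regularity and unpredictability of fluctuations in a time series. It expresses the complexity of the series as a single non-negative number.
Meaning of the value: it reflects how likely new information is to appear in the series; the more complex the series, the larger its approximate entropy.
- A strongly regular time series has a small ApEn, whereas a more complex series has a larger one.
Parameter selection:
- dim: the embedding dimension of the reconstructed phase space, usually 2 or 3.
- r: the similarity tolerance. The right value depends heavily on the application; a common choice is r = 0.2*std, where std is the standard deviation of the original time series.
- data: the input data.
- tau: the downsampling factor applied to the original signal. It defaults to 1 in the code, meaning no downsampling.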
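For reference, the quantity these parameters feed into can be written in the standard form given by Pincus, with m = dim and N the length of data. The symbols u, x, d, C and Φ are notation introduced here for illustration and do not appear in the article:

```latex
\begin{aligned}
x(i) &= [\,u(i),\ u(i+1),\ \dots,\ u(i+m-1)\,], \qquad i = 1, \dots, N-m+1 \\
d[x(i), x(j)] &= \max_{k = 0, \dots, m-1} \lvert u(i+k) - u(j+k) \rvert \\
C_i^m(r) &= \frac{\#\{\, j : d[x(i), x(j)] \le r \,\}}{N-m+1} \\
\Phi^m(r) &= \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r) \\
\mathrm{ApEn}(m, r, N) &= \Phi^m(r) - \Phi^{m+1}(r)
\end{aligned}
```

Because j = i is included in the count, C_i^m(r) is never zero, so the logarithm is always defined.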
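For a worked example, see the post "Approximate Entropy: Theory and Code Implementation", which explains the principle and use of approximate entropy clearly.
Advantages:
- a. Low dependence on data length. ApEn can be applied to small data samples (n < 50) and can be computed in real time.
- b. Good robustness to noise. If the data is noisy, ApEn can be compared against the noise level to judge how much genuine information the original data carries.
The MATLAB function is listed below.
The code is taken from "Approximate Entropy: Theory and Code Implementation".
Note: the example program comes from the File Exchange section of the MATLAB community, shared by Kijoon Lee. Many thanks!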

function apen = ApEn( dim, r, data, tau )
% ApEn
%   dim : embedded dimension
%   r : tolerance (typically 0.2 * std)
%   data : time-series data
%   tau : delay time for downsampling
%
%   Changes in version 1
%       Ver 0 had a minor error in the final step of calculating ApEn
%       because it took logarithm after summation of phi's.
%       In Ver 1, I restored the definition according to original paper's
%       definition, to be consistent with most of the work in the
%       literature. Note that this definition won't work for Sample
%       Entropy which doesn't count self-matching case, because the count
%       can be zero and logarithm can fail.
%
%       A new parameter tau is added in the input argument list, so the users
%       can apply ApEn on downsampled data by skipping by tau.
%---------------------------------------------------------------------
% coded by Kijoon Lee,  kjlee@ntu.edu.sg
% Ver 0 : Aug 4th, 2011
% Ver 1 : Mar 21st, 2012
%---------------------------------------------------------------------
if nargin < 4, tau = 1; end
if tau > 1, data = downsample(data, tau); end

N = length(data);
result = zeros(1,2);

for j = 1:2
    m = dim+j-1;
    phi = zeros(1,N-m+1);
    dataMat = zeros(m,N-m+1);

    % setting up data matrix
    for i = 1:m
        dataMat(i,:) = data(i:N-m+i);
    end

    % counting similar patterns using distance calculation
    for i = 1:N-m+1
        tempMat = abs(dataMat - repmat(dataMat(:,i),1,N-m+1));
        boolMat = any( (tempMat > r),1);
        phi(i) = sum(~boolMat)/(N-m+1);
    end

    % summing over the counts
    result(j) = sum(log(phi))/(N-m+1);
end

apen = result(1)-result(2);

end

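Source: CSDN blogger 「木须耐豆皮」, "Approximate Entropy: Theory and Code Implementation"

Note: when this entropy is applied to one or several sinusoidal components with integer frequencies, the result is close to 0.
When the frequency is fractional the result becomes larger, and it keeps growing as more fractional-frequency sinusoids are summed.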
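The integer-frequency case, together with the earlier claim that regular series score lower than complex ones, can be checked with a short script. This is only a sketch: the two test signals (a 5 Hz sine and a logistic-map series), the length N = 300 and the start value 0.3 are assumptions chosen for illustration. The loop repeats the computation of the ApEn listing inline so that the script runs on its own, without downsample or any toolbox.

```matlab
% Demo signals (hypothetical, chosen for illustration only)
N = 300;
t = (0:N-1) / N;                 % one second of samples
xSine = sin(2*pi*5*t);           % sinusoid with an integer frequency (5 Hz)

xChaos = zeros(1, N);            % logistic map, a standard chaotic series
xChaos(1) = 0.3;
for k = 2:N
    xChaos(k) = 4 * xChaos(k-1) * (1 - xChaos(k-1));
end

signals = {xSine, xChaos};
dim  = 2;
apen = zeros(1, 2);

for s = 1:2
    data = signals{s};
    r = 0.2 * std(data);         % tolerance as suggested in the text
    phiMean = zeros(1, 2);
    for j = 1:2
        m = dim + j - 1;
        K = N - m + 1;           % number of m-dimensional vectors
        dataMat = zeros(m, K);
        for i = 1:m
            dataMat(i, :) = data(i:N-m+i);
        end
        phi = zeros(1, K);
        for i = 1:K
            % Chebyshev distance from vector i to every vector (self included)
            d = max(abs(dataMat - repmat(dataMat(:, i), 1, K)), [], 1);
            phi(i) = sum(d <= r) / K;
        end
        phiMean(j) = sum(log(phi)) / K;
    end
    apen(s) = phiMean(1) - phiMean(2);
end

fprintf('ApEn sine : %.4f\n', apen(1));
fprintf('ApEn chaos: %.4f\n', apen(2));
```

The sine should come out well below the chaotic series. The script does not test the fractional-frequency part of the note.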

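Sample Entropy

(Sample Entropy, SampEn, SE)

Function: sampEn = SampleEntropy( dim, r, data, tau )
Use: an improvement on approximate entropy (ApEn) for measuring the complexity of a time series. It is used to assess the complexity of physiological time series and to diagnose pathological states, and it is also common in mechanical signal analysis and fault diagnosis.
Meaning of the value: the higher the probability that the series produces new patterns, the more complex the series and the larger the entropy.
Parameter selection:
- Embedding dimension dim: usually 1 or 2.
- Similarity tolerance r: the right value depends heavily on the application; a common choice is r = 0.1*std to 0.25*std, where std is the standard deviation of the original time series.
- data: the input data.
- tau: the downsampling factor applied to the original signal. It defaults to 1 in the code, meaning no downsampling.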
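The listings in this article call downsample, which belongs to the Signal Processing Toolbox. Assuming the default phase offset, as in those calls, the same effect can be sketched with plain indexing. The vector and the value of tau below are made up for illustration.

```matlab
% Hypothetical example signal (not from the article)
data = 1:10;
tau  = 3;

% Keep every tau-th sample, starting with the first one
dataDown = data(1:tau:end);

disp(dataDown)   % samples 1, 4, 7 and 10 remain
```

With tau = 1 the indexing returns the signal unchanged, which matches the "no downsampling" default.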
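Advantages:
- Sample entropy depends much less on data length than approximate entropy does.
- Sample entropy shows better relative consistency: if one series has a higher SampEn than another for one choice of m and r, it tends to remain higher for other choices.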
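For reference, the definition by Richman and Moorman can be summarized as follows, with m = dim, x_m(i) the template of length m starting at sample i, and d the Chebyshev (maximum) distance. A and B are notation introduced here; in the original definition both counts run over the first N - m templates.

```latex
\begin{aligned}
B &= \#\{\,(i, j) : i \ne j,\ d[x_m(i), x_m(j)] \le r \,\} \\
A &= \#\{\,(i, j) : i \ne j,\ d[x_{m+1}(i), x_{m+1}(j)] \le r \,\} \\
\mathrm{SampEn}(m, r, N) &= -\ln \frac{A}{B}
\end{aligned}
```

Self-matches (i = j) are excluded, and the logarithm is taken once at the end. The listing in this section works with averaged match fractions rather than raw counts, so for short series its value can differ slightly from this ratio.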
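Code:
The code is taken from "Principles and MATLAB Implementation of Permutation Entropy, Fuzzy Entropy, Approximate Entropy and Sample Entropy: Sample Entropy". For the underlying theory and examples, see the post "Sample Entropy: Theory and Code Implementation".
Note: the example program was adapted, following the algorithm described in that post, from the sample entropy code "SampE" shared by Kijoon Lee. Many thanks!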

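%% Sample entropy function
function sampEn = SampleEntropy( dim, r, data, tau )
%   Note: this function is a modification of Kijoon Lee's code.
%   Sample entropy was proposed by: Richman J S, Moorman J R. Physiological
%   time-series analysis using approximate entropy and sample entropy[J].
%   American Journal of Physiology - Heart and Circulatory Physiology,
%   2000, 278(6): H2039-H2049.
%   Computes the sample entropy of a given time series.
%
%   Sample entropy is conceptually similar to approximate entropy, with
%   these differences:
%       1) it does not count self-matches, and it takes the logarithm only
%          in the final step, which avoids a possible log(0);
%       2) it depends less on the data length than approximate entropy.
%
%   dim  : embedding dimension (usually 1 or 2)
%   r    : similarity tolerance (usually 0.1*std(data) to 0.25*std(data))
%   data : time series, must be a 1xN row vector
%   tau  : downsampling delay (defaults to 1, so it can be omitted)
%
if nargin < 4, tau = 1; end
if tau > 1, data = downsample(data, tau); end

N = length(data);
result = zeros(1,2);

for m = dim:dim+1
    Bi = zeros(N-m+1,1);
    dataMat = zeros(N-m+1,m);

    % build the data matrix: each row is one m-dimensional vector
    for i = 1:N-m+1
        dataMat(i,:) = data(1,i:i+m-1);
    end

    % count similar patterns using distances
    for j = 1:N-m+1
        % Chebyshev distance from vector j to every vector (self included)
        dist = max(abs(dataMat - repmat(dataMat(j,:),N-m+1,1)),[],2);
        % vectors whose distance is within the tolerance r
        D = (dist <= r);
        % subtract 1 to exclude the self-match
        Bi(j,1) = (sum(D)-1)/(N-m);
    end

    % average of all Bi
    result(m-dim+1) = sum(Bi)/(N-m+1);
end

% sample entropy
sampEn = -log(result(2)/result(1));

end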

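Source: CSDN blogger 「Zhi Zhao」, "Principles and MATLAB Implementation of Permutation Entropy, Fuzzy Entropy, Approximate Entropy and Sample Entropy: Sample Entropy"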

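Fuzzy Entropy

(Fuzzy Entropy, FuzzyEn, FE)

Function: fuzzyen = Fuzzy_Entropy( dim, r, data, tau )
- Fuzzy entropy is a refinement of sample entropy that replaces the hard similarity threshold with an exponential function, the fuzzy membership function.
Use: it measures the probability that new patterns appear; the larger the value, the higher that probability and the more complex the series.
Meaning of the value: the more complex the time series, the larger its fuzzy entropy.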
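The difference between the hard threshold of sample entropy and a fuzzy membership function can be seen on a handful of distances. This is a small sketch: the distance values, r = 0.2 and n = 2 are made up for illustration, and the membership form exp(-(d^n)/r) is the one used in the first fuzzy entropy listing in this section (other forms exist).

```matlab
% Hypothetical distances between pairs of vectors (illustration only)
d = [0 0.1 0.19 0.21 0.4];
r = 0.2;      % similarity tolerance
n = 2;        % fuzzy power

% Sample entropy style: a pair either matches (1) or does not (0)
hard = double(d <= r);

% Fuzzy entropy style: similarity decays smoothly with distance
fuzzy = exp(-(d.^n) / r);

disp([d; hard; fuzzy]);
```

Distances 0.19 and 0.21 land on opposite sides of the hard threshold, yet their fuzzy similarity degrees are nearly equal, which is what makes the fuzzy measure less sensitive to the exact choice of r.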
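Parameter selection:
- Embedding dimension dim: a larger dim (m) reconstructs the dynamic evolution of the system in finer detail.
- Similarity tolerance r: too large a tolerance discards information (the larger r is, the more is lost), while too small a tolerance makes the result more sensitive to noise. r is usually defined as r*SD, where SD (standard deviation) is the standard deviation of the original one-dimensional time series.
- data: the input data.
- tau: the downsampling factor applied to the original signal. It defaults to 1 in the code, meaning no downsampling.
Code 1: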

%% Fuzzy entropy function
function FuzEn = FuzzyEntropy(data,dim,r,n,tau)
%
% This function calculates fuzzy entropy (FuzEn) of a univariate signal data
%
% Inputs:
%
% data: univariate signal - a vector of size 1 x N (the number of sample points)
% dim: embedding dimension
% r: threshold (it is usually equal to 0.15 of the standard deviation of a signal - because we normalize signals to have a standard deviation of 1, here, r is usually equal to 0.15)
% n: fuzzy power (it is usually equal to 2)
% tau: time lag (it is usually equal to 1)
% Fuzzy entropy was proposed by: Chen Weiting, Wang Zhizhong, Xie Hongbo, et al. Characterization of surface EMG signal based on fuzzy entropy. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2007, 15(2): 266-272.
%
if nargin == 4, tau = 1; end
if nargin == 3, n = 2; tau=1; end
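if tau > 1, data = downsample(data, tau); end

N = length(data);
result = zeros(1,2);

for m = dim:dim+1
    count = zeros(N-m+1,1);
    dataMat = zeros(N-m+1,m);

    % build the data matrix: each row is one m-dimensional vector
    for i = 1:N-m+1
        dataMat(i,:) = data(1,i:i+m-1);
    end

    % remove each vector's own mean (baseline); needed only once per m
    dataMat = dataMat - mean(dataMat,2);

    % accumulate similarity degrees using distances
    for j = 1:N-m+1
        % Chebyshev distance from vector j to every vector (self included)
        tempmat = repmat(dataMat(j,:),N-m+1,1);
        dist = max(abs(dataMat - tempmat),[],2);
        % fuzzy membership; subtracting 1 removes the self-match (exp(0) = 1)
        D = exp(-(dist.^n)/r);
        count(j) = (sum(D)-1)/(N-m);
    end

    result(m-dim+1) = sum(count)/(N-m+1);
end

% fuzzy entropy
FuzEn = log(result(1)/result(2));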
end

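Source: CSDN blogger 「Zhi Zhao」, "Principles and MATLAB Implementation of Permutation Entropy, Fuzzy Entropy, Approximate Entropy and Sample Entropy: Fuzzy Entropy"

Code 2: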

function fuzzyen = Fuzzy_Entropy( dim, r, data, tau )
% FUZZYEN Fuzzy Entropy
%   calculates the fuzzy entropy of a given time series data
%
% Similarity definition based on vectors' shapes, together with the
% exclusion of self-matches, earns FuzzyEn stronger relative consistency
% and less dependence on data length.
%
%   dim     : embedded dimension
%   r       : tolerance (typically 0.2 * std)
%   data    : time-series data
%   tau     : delay time for downsampling (user can omit this, in which case
%             the default value is 1)
%
if nargin < 4, tau = 1; end
if tau > 1, data = downsample(data, tau); end

N = length(data);
Phi = zeros(1,2);

for m = dim:dim+1
    Ci = zeros(1,N-m+1);
    dataMat = zeros(m,N-m+1);

    % setting up data matrix - form vectors (one per column)
    for j = 1:m
        dataMat(j,:) = data(j:N-m+j);
    end

    % baseline of each vector (mean along dimension 1, so m = 1 also works)
    U0 = mean(dataMat,1);
    % remove baseline
    Sm = dataMat - repmat(U0,m,1);

    % Given vector Si, calculate the similarity degree between it and
    % each neighboring vector Sj
    for i = 1:N-m+1
        Sm_tmp = Sm;
        Sm_tmp(:,i) = []; % exclude the self-match
        % maximum absolute difference of the corresponding scalar components
        % of Si and Sj (j≠i)
        dij = max(abs(repmat(Sm(:,i),1,N-m) - Sm_tmp),[],1);
        % similarity degree
        Aij = exp(-log(2)*(dij/r).^2);
        % averaging all the similarity degrees of its neighboring vectors Sj
        Ci(i) = sum(Aij)/(N - m);
    end

    % averaging over all vectors
    Phi(m-dim+1) = sum(Ci)/(N-m+1); % φ_m and φ_m+1
end

fuzzyen = log(Phi(1))-log(Phi(2));  % fuzzyen = ln(φ_m)-ln(φ_m+1)
end

Source: CSDN blogger 「木须耐豆皮」, "Fuzzy Entropy: Theory and Code Implementation"

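Permutation Entropy

(Permutation Entropy, PE)

Function: [pe, hist] = PermutationEntropy(y,m,t)
Use: entropy characterizes the complexity of a signal and measures the uncertainty of information, which makes it suitable for nonlinear problems. Permutation entropy detects dynamic changes in a time series by comparing the order of neighboring values. It is a method for detecting abrupt dynamical changes and randomness in a time series, and it can quantify the random noise contained in a signal.
Meaning of the value: the smaller the entropy, the simpler and more regular the time series; the larger the entropy, the more complex and random the series.
Parameter selection:
- y: the input data.
- m: the embedding dimension.
- t: the delay time.
  - Phase-space reconstruction can estimate both: eDim (m) is the embedding dimension and eLag (t) is the delay time.
  - When X has multiple columns and rows, each column is treated as an independent time series; the algorithm assumes the same delay and embedding dimension for every column and returns ESTDIM and ESTLAG as scalars.
  - [~,eLag,eDim] = phaseSpaceReconstruction(X);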
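How m and t turn a series into ordinal patterns can be followed on a short series. This is a sketch on made-up data (y, m = 3 and t = 1 are assumptions for illustration); it counts patterns with unique and accumarray instead of looping over perms, so only the patterns that actually occur are tallied.

```matlab
% Hypothetical short series (illustration only)
y = [4 7 9 10 6 11 3];
m = 3;        % embedding dimension
t = 1;        % delay time

nVec = length(y) - t*(m-1);          % number of delay vectors
patterns = zeros(nVec, m);
for j = 1:nVec
    % ordinal pattern: indices that sort the j-th delay vector
    [~, patterns(j, :)] = sort(y(j:t:j+t*(m-1)));
end

% count how often each distinct pattern occurs
[~, ~, idx] = unique(patterns, 'rows');
counts = accumarray(idx, 1);

p = counts / sum(counts);            % relative frequencies
pe = -sum(p .* log(p));              % permutation entropy (nats)
peNorm = pe / log(factorial(m));     % normalized to [0, 1]

disp(patterns);
fprintf('pe = %.4f, normalized = %.4f\n', pe, peNorm);
```

The five delay vectors give the patterns [1 2 3] twice, [3 1 2] twice and [2 1 3] once, so p = {0.4, 0.4, 0.2}, pe is about 1.0549 and the normalized value about 0.5888.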
Code:

%% Permutation entropy algorithm
function [pe ,hist] = PermutationEntropy(y,m,t)
%  Calculate the permutation entropy (PE)
%  Permutation entropy was proposed by: Bandt C, Pompe B. Permutation entropy: a natural complexity measure for time series[J]. Physical Review Letters, 2002, 88(17): 174102.
%
%  Input:   y: time series;
%           m: order of permutation entropy (embedding dimension)
%           t: delay time of permutation entropy
%
%  Output:
%           pe:    permutation entropy
%           hist:  the histogram for the order distribution
y = y(:).';                     % force a row vector so sort returns a row of indices
ly = length(y);
permlist = perms(1:m);          % all m! possible ordinal patterns, one per row
[h,~]=size(permlist);
c = zeros(1,h);                 % occurrence count of each pattern

for j=1:ly-t*(m-1)
    % ordinal pattern of the j-th delay vector
    [~,iv]=sort(y(j:t:j+t*(m-1)));
    for jj=1:h
        if isequal(permlist(jj,:),iv)
            c(jj) = c(jj) + 1 ;
        end
    end
end
hist = c;
c=c(c~=0);
p = c/sum(c);
pe = -sum(p .* log(p));
% normalize by the maximum possible entropy, log(m!)
pe=pe/log(factorial(m));
end

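Source: CSDN blogger 「Zhi Zhao」, "Principles and MATLAB Implementation of Permutation Entropy, Fuzzy Entropy, Approximate Entropy and Sample Entropy"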

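Multiscale Entropy

(Multiscale Entropy, MSE)

Function: [mse, sf] = MSE_Costa2005(x,nSf,m,r)
Use: multiscale entropy extends sample entropy to multiple time scales, giving additional perspectives when the relevant time scale is unknown. The weakness of plain sample entropy is that it does not account well for the different time scales that may be present in a time series.
- When analyzing a speech signal, measuring complexity at the time scale of words is more effective than measuring it over the whole utterance. But if you do not know that the audio represents speech, or have no notion of speech at all, you cannot know which time scale will extract the most useful information from the raw signal. Analyzing the problem at several time scales therefore yields more information.
- In EEG, the underlying brain patterns are unknown, so the relevant time scales are unknown too. Multiscale sample entropy is therefore needed to find out which scale is most useful for analyzing the EEG signal in a given setting.
Meaning of the value: the smaller the entropy, the simpler and more regular the time series; the larger the entropy, the more complex and random the series.
Parameter selection:
- x: the input data.
- nSf: the number of scale factors.
- m: the length of the templates being compared, usually 2.
- r: the matching threshold. It is typically chosen between 10% and 20% of the sample standard deviation of the time series; when x is z-transformed, the tolerance is defined as r times the standard deviation. One possible choice is 0.15*std(y). (To keep noise from contributing significantly to the sample entropy estimate, r must be larger than most of the signal noise. Another criterion for choosing r is based on the signal dynamics.)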
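The coarse-graining step behind each scale factor can be tried on its own. This sketch uses a made-up signal and scale factor; the filter-based route is the one MSE_Costa2005 takes, and the reshape-based route is an equivalent alternative added here for comparison.

```matlab
% Hypothetical signal and scale factor (illustration only)
x = 1:10;
scale = 3;

% Route 1: moving average, then keep one value per non-overlapping window
f = ones(1, scale) / scale;      % averaging weights
y = filter(f, 1, x);             % y(k) = mean of x(k-scale+1 : k)
y = y(scale:end);                % drop the start-up samples
y = y(1:scale:end);              % step by the window length

% Route 2: reshape into windows and average each column
nWin = floor(length(x) / scale);
y2 = mean(reshape(x(1:nWin*scale), scale, nWin), 1);

disp(y);                         % approximately [2 5 8]
disp(y2);                        % [2 5 8]
```

Both routes average the windows [1 2 3], [4 5 6] and [7 8 9] and discard the leftover sample. At scale 1 the signal passes through unchanged, so the first MSE value is the ordinary sample entropy.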
Code:

function [mse,sf] = MSE_Costa2005(x,nSf,m,r)
% [mse(:,ii) sf] = MSE_Costa2005(y(:,ii),20,2,std(y(:,ii))*0.15);
% [mse sf] = MSE_Costa2005(x,nSf,m,r)
%
% x   - input signal vector (e.g., EEG signal or sound signal)
% nSf - number of scale factors
% m   - template length (epoch length); Costa used m = 2 throughout 
% r   - matching threshold; typically chosen to be between 10% and 20% of
%       the sample deviation of the time series; when x is z-transformed:
%       defined the tolerance as r times the standard deviation
%
% mse - multi-scale entropy
% sf  - scale factor corresponding to mse
%
% Interpretation: Costa interprets higher values of entropy to reflect more
% information at this scale (less predictable if random). For 1/f noise,
% entropy is pretty constant across scales.
%
% References:
% Costa et al. (2002) Multiscale Entropy Analysis of Complex Physiologic
%    Time Series. PHYSICAL REVIEW LETTERS 89
% Costa et al. (2005) Multiscale entropy analysis of biological signals.
%    PHYSICAL REVIEW E 71, 021906.
%
% Requires: SampleEntropy.m
%
% Description: The script calculates multi-scale entropy using a
% coarse-graining approach.
%
% ---------
%
%    Copyright (C) 2017, B. Herrmann
%    This program is free software: you can redistribute it and/or modify
%    it under the terms of the GNU General Public License as published by
%    the Free Software Foundation, either version 3 of the License, or
%    (at your option) any later version.
%
%    This program is distributed in the hope that it will be useful,
%    but WITHOUT ANY WARRANTY; without even the implied warranty of
%    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%    GNU General Public License for more details.
%
%    You should have received a copy of the GNU General Public License
%    along with this program.  If not, see <http://www.gnu.org/licenses/>.
%
% ----------------------------------------------------------------------
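% B. Herrmann, Email: herrmann.b@gmail.com, 2017-05-06

% pre-allocate mse vector
mse = zeros([1 nSf]);

% coarse-grain and calculate sample entropy for each scale factor
for ii = 1 : nSf
    % get filter weights
    f = ones([1 ii]);
    f = f/sum(f);

    % get coarse-grained time series (i.e., average data within non-overlapping time windows)
    y = filter(f,1,x);
    y = y(length(f):end);
    y = y(1:length(f):end);

    % calculate sample entropy
    % NOTE: the argument order assumes the SampleEntropy(dim, r, data, tau)
    % function listed earlier in this article, which needs a 1xN row vector.
    % The source called SampleEntropy(y,m,r,0), which matches a different
    % SampleEntropy.m.
    mse(ii) = SampleEntropy(m, r, y(:).', 1);
end

% get scale factors
sf = 1 : nSf;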

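Source: CSDN blogger 「木须耐豆皮」, "Multiscale Entropy"

Reference Links

Approximate Entropy: Theory and Code Implementation
Sample Entropy: Theory and Code Implementation
Fuzzy Entropy: Theory and Code Implementation
Principles and MATLAB Implementation of Permutation Entropy, Fuzzy Entropy, Approximate Entropy and Sample Entropy: Sample Entropy
Principles and MATLAB Implementation of Permutation Entropy, Fuzzy Entropy, Approximate Entropy and Sample Entropy: Fuzzy Entropy
Principles and MATLAB Implementation of Permutation Entropy, Fuzzy Entropy, Approximate Entropy and Sample Entropy
Multiscale Entropy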
