java entropy_Java ContingencyTables.entropy方法代码示例
import weka.core.ContingencyTables; //导入方法依赖的package包/类
/**
* Test using Fayyad and Irani's MDL criterion.
*
* @param priorCounts
* @param bestCounts
* @param numInstances
* @param numCutPoints
* @return true if the splits is acceptable
*/
private boolean FayyadAndIranisMDL(double[] priorCounts,
double[][] bestCounts,
double numInstances,
int numCutPoints) {
double priorEntropy, entropy, gain;
double entropyLeft, entropyRight, delta;
int numClassesTotal, numClassesRight, numClassesLeft;
// Compute entropy before split.
priorEntropy = ContingencyTables.entropy(priorCounts);
// Compute entropy after split.
entropy = ContingencyTables.entropyConditionedOnRows(bestCounts);
// Compute information gain.
gain = priorEntropy - entropy;
// Number of classes occuring in the set
numClassesTotal = 0;
for (int i = 0; i < priorCounts.length; i++) {
if (priorCounts[i] > 0) {
numClassesTotal++;
}
}
// Number of classes occuring in the left subset
numClassesLeft = 0;
for (int i = 0; i < bestCounts[0].length; i++) {
if (bestCounts[0][i] > 0) {
numClassesLeft++;
}
}
// Number of classes occuring in the right subset
numClassesRight = 0;
for (int i = 0; i < bestCounts[1].length; i++) {
if (bestCounts[1][i] > 0) {
numClassesRight++;
}
}
// Entropy of the left and the right subsets
entropyLeft = ContingencyTables.entropy(bestCounts[0]);
entropyRight = ContingencyTables.entropy(bestCounts[1]);
// Compute terms for MDL formula
delta = Utils.log2(Math.pow(3, numClassesTotal) - 2) -
(((double) numClassesTotal * priorEntropy) -
(numClassesRight * entropyRight) -
(numClassesLeft * entropyLeft));
// Check if split is to be accepted
return (gain > (Utils.log2(numCutPoints) + delta) / (double)numInstances);
}
本文来自互联网用户投稿,文章观点仅代表作者本人,不代表本站立场,不承担相关法律责任。如若转载,请注明出处。 如若内容造成侵权/违法违规/事实不符,请点击【内容举报】进行投诉反馈!
