java.lang.Object
  org.knime.base.node.mine.decisiontree2.learner.SplitQualityMeasure
    org.knime.base.node.mine.decisiontree2.learner.SplitQualityGini
public class SplitQualityGini
Implements the Gini index split quality measure. The computed Gini index is subtracted from 1 (worst value), so a larger value indicates a better split (same as for gain ratio).
| Constructor Summary |
|---|
| SplitQualityGini() |
| Method Summary | |
|---|---|
| double | getWorstValue(): Returns the worst value for this quality measure. |
| void | initQualityMeasure(double[] classFrequencies, double allOverRecords): Some quality measures, such as the information gain, compare the quality of a previous distribution with that of a new one. |
| boolean | isBetter(double quality1, double quality2): A Gini index is better if it is larger than the other one. |
| boolean | isBetterOrEqual(double quality1, double quality2): A Gini index is better or equal if it is larger than or equal to the other one. |
| double | measureQuality(double allOverRecords, double[] partitionFrequency, double[][] partitionClassFrequency, double numUnknownRecords): Calculates the Gini split index. |
| double | postProcessMeasure(double qualityMeasure, double allOverRecords, double[] partitionFrequency, double numUnknownRecords): The Gini index does not need to post-process the measure. |
| String | toString() |
| Methods inherited from class org.knime.base.node.mine.decisiontree2.learner.SplitQualityMeasure |
|---|
clone |
| Methods inherited from class java.lang.Object |
|---|
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public SplitQualityGini()
| Method Detail |
|---|
public boolean isBetter(double quality1,
double quality2)
isBetter in class SplitQualityMeasure
Parameters:
quality1 - first quality to compare
quality2 - second quality to compare
public boolean isBetterOrEqual(double quality1,
double quality2)
isBetterOrEqual in class SplitQualityMeasure
Parameters:
quality1 - first quality to compare
quality2 - second quality to compare
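The "larger is better" convention described for isBetter and isBetterOrEqual can be sketched as follows. This is an illustrative sketch only, not the KNIME source: the class name LargerIsBetterSketch, the helper bestIndex, and its worstValue parameter are hypothetical and are used here just to show how a learner might pick the best candidate split under this convention.

```java
// Illustrative sketch; class and helper names are hypothetical, not part of KNIME.
final class LargerIsBetterSketch {

    /** Assumed convention: quality1 is better if it is strictly larger. */
    static boolean isBetter(double quality1, double quality2) {
        return quality1 > quality2;
    }

    /** Assumed convention: quality1 is better or equal if it is at least as large. */
    static boolean isBetterOrEqual(double quality1, double quality2) {
        return quality1 >= quality2;
    }

    /**
     * Returns the index of the best candidate quality, or -1 if no candidate
     * is better than the given worst value (hypothetical helper).
     */
    static int bestIndex(double[] qualities, double worstValue) {
        int best = -1;
        double bestQuality = worstValue;
        for (int i = 0; i < qualities.length; i++) {
            if (isBetter(qualities[i], bestQuality)) {
                best = i;
                bestQuality = qualities[i];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] candidates = {0.2, 0.7, 0.5};
        System.out.println("best candidate index: " + bestIndex(candidates, 0.0));
    }
}
```

Starting the search from a worst value means that a candidate which is no better than the worst value is never selected.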
public double measureQuality(double allOverRecords,
double[] partitionFrequency,
double[][] partitionClassFrequency,
double numUnknownRecords)
For a data set T, the Gini index is: gini(T) = 1 - SUM(pj * pj), summed over all relative class frequencies pj (pj = Pj/|T|). Pj is the absolute class frequency and |T| the number of records in the data set.
The Gini index for the split is: giniSplit(T) = SUM(nx/N * gini(Tx)), summed over all partitions Tx with relative partition frequencies nx/N.
measureQuality in class SplitQualityMeasure
Parameters:
allOverRecords - the overall number of records with known values in the partition to split; corresponds to N in the formula
partitionFrequency - the frequencies of the different partitions; corresponds to nx in the formula
partitionClassFrequency - all class frequencies Pj (second dimension) for all partitions Tx (first dimension)
numUnknownRecords - the number of records with unknown (missing) value of the relevant attribute; used to weight the quality measure
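The two formulas above can be worked through in a few lines of Java. This is a minimal sketch under stated assumptions, not the KNIME implementation: the class GiniSplitSketch and its methods are hypothetical, the parameter names merely mirror those documented above, an empty partition is assumed to contribute a Gini index of 0, and the weighting by numUnknownRecords is deliberately left out because its exact form is not given here.

```java
// Illustrative sketch of the documented formulas; not the KNIME source.
final class GiniSplitSketch {

    /** gini(T) = 1 - SUM(pj * pj), with pj = Pj / |T|. */
    static double gini(double[] classFrequencies) {
        double total = 0.0;
        for (double frequency : classFrequencies) {
            total += frequency;
        }
        if (total == 0.0) {
            return 0.0; // assumption: an empty partition contributes nothing
        }
        double sumOfSquares = 0.0;
        for (double frequency : classFrequencies) {
            double p = frequency / total; // relative class frequency pj
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    /** giniSplit(T) = SUM(nx / N * gini(Tx)) over all partitions Tx. */
    static double giniSplit(double allOverRecords,
                            double[] partitionFrequency,
                            double[][] partitionClassFrequency) {
        double split = 0.0;
        for (int x = 0; x < partitionFrequency.length; x++) {
            double weight = partitionFrequency[x] / allOverRecords; // nx / N
            split += weight * gini(partitionClassFrequency[x]);
        }
        return split;
    }

    public static void main(String[] args) {
        // Two partitions of 4 and 6 records, two classes each (made-up numbers).
        double[] partitionFrequency = {4.0, 6.0};
        double[][] partitionClassFrequency = {{4.0, 0.0}, {1.0, 5.0}};
        double split = giniSplit(10.0, partitionFrequency, partitionClassFrequency);
        System.out.println("giniSplit     = " + split);
        // Subtracting from 1 turns the index into a "larger is better" quality.
        System.out.println("1 - giniSplit = " + (1.0 - split));
    }
}
```

In this made-up example the first partition is pure (Gini index 0) and the second has a Gini index of 10/36, so the split index is 0.6 * 10/36 = 1/6.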
public double getWorstValue()
getWorstValue in class SplitQualityMeasure
public void initQualityMeasure(double[] classFrequencies,
double allOverRecords)
initQualityMeasure in class SplitQualityMeasure
Parameters:
classFrequencies - the class frequencies
allOverRecords - the overall count
public String toString()
toString in class SplitQualityMeasure
public double postProcessMeasure(double qualityMeasure,
double allOverRecords,
double[] partitionFrequency,
double numUnknownRecords)
postProcessMeasure in class SplitQualityMeasure
Parameters:
qualityMeasure - the quality measure to post-process
allOverRecords - the overall number of known (non-missing) records
partitionFrequency - the frequencies of the potential split partitions
numUnknownRecords - the number of unknown (missing) records