In the Column Importance context, purity measures how strongly the label values in a set of records are concentrated in a single class. The cumulative purity measures how well the data is partitioned with respect to the label values. The data is partitioned using the columns found to be important, in the same way that data is partitioned in a Decision Tree. Each set in the partition has its own purity measure, and the cumulative purity of the partition is a combination of these individual measures. For a given set in the partition, the purity is 0 if every class has equal representation, and 100 if every record is of the same class. Similarly, the cumulative purity is 0 if every set in the partition has an equal representation of classes, and 100 if every set in the partition contains only records of a single class. In MineSet, purity is based on entropy.
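
The text does not give the exact formula MineSet uses, so the following Python sketch shows only one plausible entropy-based definition that matches the endpoints described above. It rests on two assumptions: the purity of a set is taken as 100 × (1 − H / log2(k)), where H is the entropy of the set's label distribution and k is the number of classes, and the cumulative purity is taken as the average of the per-set purities weighted by set size. The function names `purity` and `cumulative_purity` are illustrative and are not part of MineSet.

```python
import math
from collections import Counter


def purity(labels, num_classes):
    """Purity of one set on a 0-100 scale.

    Assumption: purity = 100 * (1 - H / log2(num_classes)), where H is the
    entropy (in bits) of the label distribution. This is not confirmed to be
    MineSet's formula.
    """
    if not labels:
        raise ValueError("cannot compute purity of an empty set")
    if num_classes < 2:
        # With a single possible class, every set is trivially pure.
        return 100.0
    n = len(labels)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())
    return 100.0 * (1.0 - entropy / math.log2(num_classes))


def cumulative_purity(partition, num_classes):
    """Size-weighted average of per-set purities (an assumed combination rule)."""
    non_empty = [s for s in partition if s]
    total = sum(len(s) for s in non_empty)
    if total == 0:
        raise ValueError("partition contains no records")
    return sum(len(s) / total * purity(s, num_classes) for s in non_empty)


# A set with equal class representation versus a single-class set.
mixed = purity(["yes", "no", "yes", "no"], num_classes=2)
pure = purity(["yes", "yes", "yes"], num_classes=2)

# A partition that separates the classes perfectly versus one that does not.
good_split = cumulative_purity([["yes", "yes"], ["no", "no"]], num_classes=2)
bad_split = cumulative_purity([["yes", "no"], ["yes", "no"]], num_classes=2)
```

Under these assumptions, `mixed` and `bad_split` sit at the low end of the scale and `pure` and `good_split` at the high end, which is the behavior the definition above describes.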