In addition to Dennis's answer (which can be found in full at Different decision tree algorithms with comparison of complexity or performance) I would like to mention one more difference. As you probably know, Machine Learning algorithms usually try to minimise a cost (or loss) function of some sort. When training a decision tree, the algorithm has to figure out which features are "more important" for the classification so that they can be checked earlier in the tree. ID3 and CART use different criteria (think cost functions) to achieve this:

  • ID3 and C4.5 use Shannon Entropy to pick the features with the greatest information gain as nodes. As an example, let's say we would like to classify animals. You would probably ask more general questions (like "Is it a mammal?") first and, once confirmed, continue with more specific questions (like "Is it a monkey?"). In terms of information gain, the general question of our toy example gives you more information on top of what you already know (that it is an animal).
  • CART uses Gini Impurity instead. Gini Impurity is a measure of the homogeneity (or "purity") of the nodes. If all datapoints at one node belong to the same class then this node is considered "pure". So by minimising the Gini Impurity the decision tree finds the features that separate the data best.
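To make the two criteria concrete, here is a minimal sketch in plain Python (the function names and the toy "mammal vs. bird" split are my own illustration, not from any particular library): Shannon entropy and information gain as used by ID3/C4.5, and Gini impurity as used by CART.

```python
from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    # Gini impurity: chance that two labels drawn at random differ.
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    # Entropy reduction from splitting `parent` into the `children` subsets.
    n = len(parent)
    weighted_child_entropy = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted_child_entropy

# Toy example: splitting on "Is it a mammal?" yields two pure subsets.
animals = ["mammal", "mammal", "bird", "bird"]
split = [["mammal", "mammal"], ["bird", "bird"]]

print(entropy(animals))                  # 1.0 bit of uncertainty before the split
print(information_gain(animals, split))  # 1.0 — the split removes all uncertainty
print(gini(["mammal", "mammal"]))        # 0.0 — a "pure" node
```

Note that both criteria reward the same kind of split here: a question that produces pure child nodes maximises information gain and minimises Gini impurity at the same time; the two measures mostly differ in shape, not in the splits they prefer.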