Saturday, April 30, 2011

Decision Trees (algorithms): REPTree

  • J48 (C4.5)

The J48 algorithm is the Weka implementation of the C4.5 top-down decision tree learner proposed by Quinlan. It is a variant of ID3 that uses a greedy technique: at each step it determines the most predictive attribute and splits the node on that attribute, so each node represents a decision point over the value of some attribute. J48 attempts to account for noise and missing data, and it handles numeric attributes by determining where thresholds for decision splits should be placed. The main parameters that can be set for this algorithm are the confidence threshold for pruning, the minimum number of instances per leaf, and the number of folds for reduced-error pruning (a usage sketch follows after this list).
Ross Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA.
  • REPTree
The REPTree algorithm is a fast decision tree learner. It builds a decision or regression tree using information gain (or variance reduction, for regression) as the splitting criterion and prunes it with reduced-error pruning (with back-fitting). For speed, it sorts the values of numeric attributes only once. Missing values are dealt with by splitting the corresponding instances into pieces, as in C4.5. The sketches below show how both learners can be configured through Weka's Java API.
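
As a rough sketch of how the J48 parameters mentioned above map onto Weka's Java API (the data file name is only a placeholder; the setters belong to weka.classifiers.trees.J48):

import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class J48Example {
    public static void main(String[] args) throws Exception {
        // Load an ARFF file; "weather.arff" is just a placeholder data set.
        Instances data = DataSource.read("weather.arff");
        data.setClassIndex(data.numAttributes() - 1); // last attribute is the class

        J48 tree = new J48();
        tree.setConfidenceFactor(0.25f); // confidence threshold for C4.5 pruning (0.25 is the default)
        tree.setMinNumObj(2);            // minimum number of instances per leaf
        // To prune with reduced-error pruning instead of C4.5 pruning:
        // tree.setReducedErrorPruning(true);
        // tree.setNumFolds(3);          // folds for reduced-error pruning

        tree.buildClassifier(data);      // grow and prune the decision tree
        System.out.println(tree);        // print the induced tree
    }
}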
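
A similar sketch for REPTree, here estimated with 10-fold cross-validation via Weka's Evaluation class (again, the data file name is a placeholder):

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.REPTree;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class REPTreeExample {
    public static void main(String[] args) throws Exception {
        // "cpu.arff" is a placeholder; REPTree handles both nominal and numeric classes.
        Instances data = DataSource.read("cpu.arff");
        data.setClassIndex(data.numAttributes() - 1);

        REPTree tree = new REPTree();
        tree.setNumFolds(3);  // data split for reduced-error pruning (one fold held out for pruning)
        tree.setMaxDepth(-1); // -1 = no depth limit

        // 10-fold cross-validation to estimate the error of the pruned tree.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(tree, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }
}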


REFERENCE 


MyDataMining’s Weblog
http://mydatamining.wordpress.com/2008/04/14/decision-trees/#comments

1 comment:

  1. Copied from the web (a parameter-level sketch follows the quote):

    RandomTree does no pruning (aside from the simple pre-pruning of
    stopping at a specified depth); REPTree does reduced-error pruning.
    RandomTree considers a set of K randomly chosen attributes to split
    on at each node; REPTree considers all the attributes.

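A minimal sketch of where the difference quoted above shows up in Weka's Java API (the option values are illustrative only):

import weka.classifiers.trees.RandomTree;
import weka.classifiers.trees.REPTree;

public class RandomTreeVsREPTree {
    public static void main(String[] args) {
        // RandomTree: no reduced-error pruning, only simple pre-pruning via a depth limit,
        // and it considers K randomly chosen attributes at each node.
        RandomTree randomTree = new RandomTree();
        randomTree.setKValue(3);   // K randomly chosen attributes per node
        randomTree.setMaxDepth(5); // pre-pruning: stop at a fixed depth

        // REPTree: considers all attributes at each node and applies reduced-error pruning.
        REPTree repTree = new REPTree();
        repTree.setNumFolds(3);    // folds for reduced-error pruning
    }
}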