PROJECT TITLE:
New Splitting Criteria for Decision Trees in Stationary Data Streams - 2017
The most popular tools for stream data mining are based on decision trees. Over the previous fifteen years, all designed methods, headed by the Very Fast Decision Tree (VFDT) algorithm, relied on Hoeffding's inequality, and many researchers followed this scheme. Recently, we have demonstrated that although Hoeffding decision trees are an effective tool for handling stream data, they are a purely heuristic procedure; for instance, classical decision trees such as ID3 or CART cannot be adapted to data stream mining using Hoeffding's inequality. There is therefore an urgent need to develop new algorithms that are both mathematically justified and characterized by good performance. In this paper, we address this problem by developing a family of new splitting criteria for classification in stationary data streams and investigating their probabilistic properties. The new criteria, derived using appropriate statistical tools, are based on the misclassification error and the Gini index impurity measures. A general division of splitting criteria into two types is proposed. Attributes chosen based on type-I splitting criteria guarantee, with high probability, the highest expected value of the split measure. Type-II criteria guarantee that the chosen attribute is the same, with high probability, as the one that would be chosen based on the whole infinite data stream. Moreover, two hybrid splitting criteria are proposed, which are combinations of the single criteria based on the misclassification error and the Gini index.
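For orientation, the two impurity measures named in the abstract, and the classical Hoeffding bound that earlier stream decision trees relied on, can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, the formulas are the standard textbook definitions, and the code does not reproduce the new type-I, type-II, or hybrid criteria of the paper, whose exact bounds are not given in the abstract.

```python
import math
from collections import Counter


def gini_index(labels):
    """Gini impurity: 1 minus the sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def misclassification_error(labels):
    """Misclassification error: 1 minus the frequency of the majority class."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - max(Counter(labels).values()) / n


def split_gain(labels, partition, impurity):
    """Split measure: parent impurity minus the size-weighted child impurity.

    `partition` is a list of label lists, one per branch of the candidate split.
    """
    n = len(labels)
    weighted = sum(len(child) / n * impurity(child) for child in partition)
    return impurity(labels) - weighted


def hoeffding_epsilon(value_range, delta, n):
    """Classical Hoeffding bound for the mean of n observations in a range of
    width `value_range`, holding with probability at least 1 - delta.

    Shown only for reference: the abstract argues that applying this bound to
    split measures of this kind is heuristic rather than justified.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    parent = [0, 0, 1, 1]
    branches = [[0, 0], [1, 1]]  # a candidate split that separates the classes
    print(split_gain(parent, branches, gini_index))
    print(split_gain(parent, branches, misclassification_error))
    print(hoeffding_epsilon(1.0, 0.05, 1000))
```

In a stream setting, a split decision is typically made once the difference between the two best attributes' split measures exceeds a bound of this kind; the paper's contribution is to replace the Hoeffding-based bound with ones derived specifically for these impurity measures.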