PROJECT TITLE:
Enhancing Binary Classification by Modeling Uncertain Boundary in Three-Way Decisions - 2017
Text classification is the process of assigning documents to predefined categories using classifiers learned from labeled or unlabeled training samples. Many researchers working on binary text classification try to find a more effective way to separate relevant texts from a large data set. However, current text classifiers cannot unambiguously describe the decision boundary between positive and negative objects because of uncertainties caused by text feature selection and the knowledge learning process. This paper proposes a three-way decision model that addresses the uncertain boundary to improve binary text classification performance, based on rough set techniques and a centroid solution. It aims to understand the uncertain boundary by partitioning the training samples into three regions (the positive, boundary, and negative regions) using two main boundary vectors, CP and CN, created from the labeled positive and negative training subsets, respectively. It then further resolves the objects in the boundary region using two derived boundary vectors, BP and BN, produced according to the structure of the boundary region. The model follows an indirect strategy that consists of two successive steps in the overall classification process: 'two-way to three-way' and 'three-way to two-way'. Four decision rules are derived from the training process and applied to incoming documents for more precise classification. A large number of experiments were conducted on the standard data sets RCV1 and Reuters-21578. The experimental results show that the use of boundary vectors is effective and efficient for handling uncertainties of the decision boundary, and that the proposed model significantly improves the performance of binary text classification in terms of F1 measure and AUC compared with six other standard baseline models.
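The 'two-way to three-way' step can be pictured with a rough sketch. This is not the paper's method: it assumes documents are already term-weight vectors, substitutes plain class centroids for the boundary vectors CP and CN, and uses a hypothetical `margin` threshold to decide which documents fall into the boundary region. The derived vectors BP and BN and the four decision rules are not reproduced here.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def three_way_region(doc, c_pos, c_neg, margin=0.1):
    """Assign a document to the positive, negative, or boundary region.

    `margin` is an illustrative assumption: documents whose similarity
    to the two centroids differs by no more than `margin` are treated
    as uncertain and left in the boundary region for a second pass.
    """
    diff = cosine(doc, c_pos) - cosine(doc, c_neg)
    if diff > margin:
        return "positive"
    if diff < -margin:
        return "negative"
    return "boundary"

# Toy two-term vectors standing in for labeled training documents.
positives = [[1.0, 0.0], [0.9, 0.1]]
negatives = [[0.0, 1.0], [0.1, 0.9]]
c_pos = centroid(positives)
c_neg = centroid(negatives)

for doc in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    print(doc, three_way_region(doc, c_pos, c_neg))
```

In this toy setup, a document that is equally similar to both centroids stays in the boundary region; a 'three-way to two-way' step would then have to assign it to one of the two classes.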