Bilevel Feature Extraction-Based Text Mining for Fault Diagnosis of Railway Systems - 2016
A vast amount of text data is recorded in the form of repair verbatim in railway maintenance sectors. Efficient text mining of such maintenance data plays an important role in detecting anomalies and improving fault diagnosis efficiency. However, unstructured verbatim, high-dimensional data, and imbalanced fault class distribution pose challenges for feature selection and fault diagnosis. We propose a bilevel feature extraction-based text mining method that integrates features extracted at both the syntax and semantic levels with the aim of improving fault classification performance. We first perform an improved χ² statistics-based feature selection at the syntax level to overcome the learning difficulty caused by an imbalanced data set. Then, we perform a prior latent Dirichlet allocation-based feature selection at the semantic level to reduce the data set into a low-dimensional topic space. Finally, we fuse the fault features derived from both the syntax and semantic levels via serial fusion. The proposed method uses fault features at different levels and enhances the precision of fault diagnosis for all fault classes, particularly minority ones. Its performance has been validated using a railway maintenance data set collected from 2008 to 2014 by a railway corporation. It outperforms traditional approaches.
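The syntax-level selection and the serial fusion step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the standard χ² statistic rather than the improved variant, keeps the top-k terms per class as an assumed way of giving minority classes their own features, and takes the semantic-level topic proportions as a given input instead of fitting a latent Dirichlet allocation model. The function names, the toy repair records, and the topic vector are all hypothetical.

```python
from collections import Counter


def chi2_score(docs, labels, term, cls):
    """Standard chi-square statistic between a term and a fault class.

    This is the textbook statistic, not the improved variant of the paper.
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == cls
        if has_term and in_class:
            a += 1  # in class, contains term
        elif has_term:
            b += 1  # other class, contains term
        elif in_class:
            c += 1  # in class, lacks term
        else:
            d += 1  # other class, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def select_terms(docs, labels, k):
    """Keep the k best terms of every class (an assumed strategy so that
    minority classes are not crowded out by majority-class terms)."""
    vocab = sorted({t for doc in docs for t in doc})
    selected = []
    for cls in sorted(set(labels)):
        ranked = sorted(vocab, key=lambda t: (-chi2_score(docs, labels, t, cls), t))
        for term in ranked[:k]:
            if term not in selected:
                selected.append(term)
    return selected


def serial_fusion(tokens, selected_terms, topic_vector):
    """Serial fusion: concatenate syntax-level term counts with the
    semantic-level topic proportions into one feature vector."""
    counts = Counter(tokens)
    syntax_features = [float(counts[t]) for t in selected_terms]
    return syntax_features + list(topic_vector)


# Hypothetical, tokenized repair records and their fault classes.
docs = [
    ["signal", "lamp", "dark"],
    ["signal", "relay", "fault"],
    ["track", "circuit", "red"],
    ["track", "circuit", "fault"],
    ["switch", "motor", "jam"],
]
labels = ["signal", "signal", "track", "track", "switch"]

terms = select_terms(docs, labels, k=2)
# Assumed topic proportions for the first record (would come from an LDA model).
fused = serial_fusion(docs[0], terms, [0.7, 0.3])
print(terms)
print(fused)
```

The fused vector has one entry per selected term followed by one entry per topic, so any standard classifier can be trained on it directly.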