Web information extraction is a vital part of several Web data analysis applications. In this paper, we formulate the data extraction problem as the decoding process of page generation based on structured data and tree templates. We propose FiVaTech, an unsupervised, page-level data extraction approach that deduces the schema and templates for each individual Deep Web site, which contains either singleton or multiple data records in one Web page. FiVaTech applies tree matching, tree alignment, and mining techniques to achieve this challenging task. In experiments, FiVaTech has much higher precision than EXALG and is comparable with other record-level extraction systems such as ViPER and MSE. The experiments show an encouraging result for the test pages used in several state-of-the-art Web data extraction works.
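To make the tree-matching idea concrete, the following is a minimal sketch of Simple Tree Matching, a classic dynamic-programming algorithm that counts the largest set of matching nodes between two ordered, labeled trees. The abstract does not state which matching algorithm FiVaTech uses, so this is an illustrative assumption rather than the authors' method; the names `Node`, `simple_tree_matching`, `tree_size`, and `similarity` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A DOM-like node: a tag name plus an ordered list of children."""
    tag: str
    children: List["Node"] = field(default_factory=list)


def simple_tree_matching(a: Node, b: Node) -> int:
    """Return the number of node pairs in a maximum order-preserving match."""
    # Roots with different tags cannot be matched at all.
    if a.tag != b.tag:
        return 0
    m, n = len(a.children), len(b.children)
    # dp[i][j] = best match between the first i children of a
    # and the first j children of b (sequence-alignment style).
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = simple_tree_matching(a.children[i - 1], b.children[j - 1])
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1] + pair)
    # Add 1 for the matched roots themselves.
    return dp[m][n] + 1


def tree_size(node: Node) -> int:
    """Count all nodes in the subtree rooted at node."""
    return 1 + sum(tree_size(child) for child in node.children)


def similarity(a: Node, b: Node) -> float:
    """Normalized score in [0, 1]; 1.0 means the trees are identical."""
    return 2 * simple_tree_matching(a, b) / (tree_size(a) + tree_size(b))


# Two hypothetical data records that differ by one optional cell.
rec1 = Node("tr", [Node("td", [Node("a")]), Node("td")])
rec2 = Node("tr", [Node("td", [Node("a")]), Node("td"), Node("td")])
print(simple_tree_matching(rec1, rec2))  # 4 matched nodes
print(round(similarity(rec1, rec2), 3))  # 0.889
```

A high similarity score between sibling subtrees is one plausible signal that they are instances of the same record template, which is the kind of evidence a tree-alignment step could then build on.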