PROJECT TITLE:
Diagnosing and Minimizing Semantic Drift in Iterative Bootstrapping Extraction - 2018
Semantic drift is a common problem in iterative information extraction. Previous approaches for minimizing semantic drift may incur a substantial loss in recall. We observe that most semantic drift is introduced by a small number of questionable extractions in the earlier rounds of iteration. These extractions subsequently introduce a large number of questionable results, which lead to the semantic drift phenomenon. We call these questionable extractions Drifting Points (DPs). If erroneous extractions are the "symptoms" of semantic drift, then DPs are its "causes". In this project, we propose a method to minimize semantic drift by identifying the DPs and removing the results they introduce. We use isA (concept-instance) extraction as an example to describe our approach to cleaning information extraction errors caused by semantic drift, but we perform experiments on different relation extraction processes over three large, real information extraction collections. The experimental results show that our DP cleaning method lets us clean around 90% of incorrect instances or patterns with about 90% precision, which outperforms the previous approaches we compare with.
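The cleaning step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the DPs have already been identified (the abstract does not say how), that each extraction records the single earlier extraction that introduced it, and that a DP itself is kept while everything it introduced is removed. The names `introduced_by`, `drift_results`, and `remove_drift`, as well as the toy isA data for the concept "animal", are hypothetical.

```python
from collections import defaultdict, deque

def drift_results(introduced_by, drifting_points):
    """Return every extraction transitively introduced by a drifting point.

    introduced_by maps an extraction to the earlier extraction that
    triggered it (assumption: exactly one trigger per extraction).
    """
    # Invert the provenance map: trigger -> extractions it introduced.
    children = defaultdict(set)
    for child, parent in introduced_by.items():
        children[parent].add(child)

    tainted = set()
    queue = deque(drifting_points)
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in tainted:
                tainted.add(child)
                queue.append(child)
    return tainted

def remove_drift(introduced_by, seeds, drifting_points):
    """Keep seeds and extractions that were not introduced via a DP."""
    everything = set(seeds) | set(introduced_by)
    return everything - drift_results(introduced_by, drifting_points)

# Toy isA extractions for the concept "animal" (made-up data).
seeds = {"dog", "cat"}
introduced_by = {
    "chicken": "dog",   # round 1: plausible animal, but ambiguous (also a food)
    "lion": "cat",      # round 1
    "beef": "chicken",  # round 2: drift toward foods starts here
    "pork": "chicken",  # round 2
    "steak": "beef",    # round 3
}

kept = remove_drift(introduced_by, seeds, drifting_points={"chicken"})
print(sorted(kept))  # ['cat', 'chicken', 'dog', 'lion']
```

In this toy run a single questionable extraction ("chicken") accounts for three erroneous results, and removing only its descendants leaves the unrelated extraction "lion" untouched, which is the intuition behind cleaning at the cause rather than discarding extractions broadly.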