PROJECT TITLE:

String Similarity Search: A Hash-Based Approach - 2018

ABSTRACT:

String similarity search is a fundamental query that is widely used for DNA sequencing, error-tolerant query autocompletion, and the data cleaning required in databases, data warehouses, and data mining. In this project, we study string similarity search based on edit distance, which is supported by many database management systems such as Oracle and PostgreSQL. Given the edit distance, ed(s, t), between two strings s and t, string similarity search finds every string t in a string database D that is similar to a query string s, that is, ed(s, t) ≤ τ for a given threshold τ. In the literature, most existing work takes a filter-and-verify approach, where the filter step reduces the high cost of verifying pairs of strings by using an index built offline for D. The two state-of-the-art approaches are prefix filtering and local filtering. In this project, we study string similarity search where strings can be either short or long. Our approach supports long strings, which the existing approaches do not support well because of the size of the index they build and the time needed to build it. We propose two new hash-based labeling techniques, named OX label and XX label, for string similarity search. We assign a hash-label, H_s, to a string s, and in the filter step we prune dissimilar strings by comparing the two hash-labels, H_s and H_t, of two strings s and t. The key idea is to use the dissimilar bit-patterns between two hash-labels. We discuss our hash-based approaches, analyze their pruning power, and give the algorithms. In our experiments, the hash-based approaches achieve high efficiency while keeping the index size and index construction time one order of magnitude smaller than those of the existing approaches.
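
The filter-and-verify idea described above can be illustrated with a minimal Python sketch. It is not the OX label or XX label technique, whose definitions are not given in this abstract. As a stand-in it uses a simple character-presence bit label: each character sets one bit, and because a single insertion, deletion, or substitution can change at most two bits of such a label, two strings whose labels differ in more than 2τ bits cannot be within edit distance τ. The names `presence_label`, `may_be_similar`, `similarity_search`, and the 64-bit label width are assumptions made for illustration only.

```python
from typing import List

LABEL_BITS = 64  # hypothetical label width, chosen only for this sketch


def edit_distance(s: str, t: str) -> int:
    """Classic dynamic-programming edit distance, O(len(s) * len(t)) time."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,           # delete cs
                            curr[j - 1] + 1,       # insert ct
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def presence_label(s: str, bits: int = LABEL_BITS) -> int:
    """Stand-in hash-label: set one bit per distinct character (mod `bits`)."""
    label = 0
    for ch in s:
        label |= 1 << (ord(ch) % bits)
    return label


def may_be_similar(label_s: int, label_t: int, tau: int) -> bool:
    """Filter step: one edit flips at most two label bits, so more than
    2 * tau differing bits proves ed(s, t) > tau and the pair is pruned."""
    differing_bits = bin(label_s ^ label_t).count("1")
    return differing_bits <= 2 * tau


def similarity_search(query: str, database: List[str], tau: int) -> List[str]:
    """Return every t in `database` with ed(query, t) <= tau."""
    # In a real system the labels would be computed offline and stored as the index.
    index = [(t, presence_label(t)) for t in database]
    q_label = presence_label(query)
    results = []
    for t, t_label in index:
        if not may_be_similar(q_label, t_label, tau):
            continue  # pruned cheaply, no edit-distance computation needed
        if edit_distance(query, t) <= tau:  # verify step
            results.append(t)
    return results
```

For example, `similarity_search("kitten", ["kitten", "sitting", "mitten", "zebra"], 1)` returns `["kitten", "mitten"]`. The filter only discards strings that cannot match, so the result is the same as running the edit-distance check on every string; the labels proposed in the project are designed to prune far more aggressively than this simple stand-in.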




