Automatic Keyword and Sentence-Based Text Summarization for Software Bug Reports


Text summarization extracts the essential information from documents quickly and efficiently. The proposed unsupervised method summarizes bug reports (software artefacts) so that the summary covers the entire content and the diverse information they contain. To extract meaningful keywords and key-phrases with relevance scores, the method employs Rapid Automatic Keyword Extraction (RAKE) and term frequency-inverse document frequency (TF-IDF). Fuzzy C-means clustering is then used to extract, from each cluster, the sentences whose degree of membership exceeds a threshold value. Sentence selection is handled by a rule engine. Its rules are created using domain knowledge and are based on the keywords and sentences gathered in the preceding extraction and clustering steps. Applied to Apache bug reports, the proposed method generates a cohesive and consistent summary. Hierarchical clustering is then applied to enrich the derived summary by removing redundancy and re-ranking its sentences. The approach is evaluated on the newly developed Apache Project Bug Report Corpus (APBRC) as well as the existing Bug Report Corpus (BRC). Precision, recall, pyramid precision, and F-score are the performance measures used to compare the results. According to the experimental data, the proposed approach outperforms previous baseline approaches such as BRC and LRCA. It also outperforms state-of-the-art unsupervised methods such as Hurried, centroid, and others. To provide complete coverage and a coherent summary, it extracts essential keyword phrases and sentences from each cluster. On the APBRC corpus, the approach achieves average values of 78.22% precision, 82.18% recall, 80.10% F-score, and 81.66% pyramid precision.
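
The membership-threshold step described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the sentence vectors (for example TF-IDF weights) and the cluster centres have already been computed, and it only applies the standard fuzzy C-means membership formula before keeping the sentences above a cut-off. The function names, the toy 2-D vectors, the fuzzifier m = 2, and the threshold of 0.7 are all hypothetical choices made for illustration.

```python
import math


def memberships(vectors, centers, m=2.0):
    """Standard fuzzy C-means membership of each vector in each cluster.

    u[i][j] = 1 / sum_k (d(i, j) / d(i, k)) ** (2 / (m - 1)),
    where d is the Euclidean distance to a cluster centre.
    """
    exponent = 2.0 / (m - 1.0)
    rows = []
    for v in vectors:
        dists = [math.dist(v, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The vector sits exactly on one or more centres:
            # share full membership among those centres.
            hits = [d == 0.0 for d in dists]
            share = 1.0 / sum(hits)
            rows.append([share if h else 0.0 for h in hits])
            continue
        rows.append(
            [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        )
    return rows


def select_sentences(sentences, vectors, centers, threshold=0.7, m=2.0):
    """Keep, per cluster, the sentences whose membership meets the threshold."""
    selected = {j: [] for j in range(len(centers))}
    for sentence, row in zip(sentences, memberships(vectors, centers, m)):
        for j, degree in enumerate(row):
            if degree >= threshold:
                selected[j].append(sentence)
    return selected


# Toy data (hypothetical): four sentences, two assumed cluster centres.
sentences = [
    "App crashes on startup",
    "Stack trace shows NullPointerException",
    "Please add a dark mode",
    "Not sure if this is related",
]
vectors = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.5, 0.5)]
centers = [(1.0, 0.0), (0.0, 1.0)]

selected = select_sentences(sentences, vectors, centers)
for cluster, chosen in selected.items():
    print(cluster, chosen)
```

In this toy run the last sentence is equally close to both centres, so its membership is split evenly and it falls below the cut-off for either cluster. A full fuzzy C-means run would also update the centres iteratively, which this sketch leaves out.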


