PROJECT TITLE:
Automatic Keyword and Sentence-Based Text Summarization for Software Bug Reports
Text summarization is a method of extracting essential information from documents quickly and efficiently. The proposed unsupervised method aims to summarize bug reports (software artefacts) while covering their full content and diverse information. To extract meaningful keywords and key-phrases with a relevance score, the method employs Rapid Automatic Keyword Extraction (RAKE) and term frequency-inverse document frequency (TF-IDF). Fuzzy C-means clustering is used to extract, from each cluster, the sentences whose degree of membership exceeds a threshold value. Sentence selection is handled by a rule engine. The rules are created using domain knowledge and are based on the keywords and sentences obtained from the clustering step. Applied to Apache bug reports, the proposed method generates a cohesive and consistent summary. Hierarchical clustering is then used to refine the derived summary by removing redundancy and re-ranking it. The proposed approach is evaluated on the newly developed Apache Project Bug Report Corpus (APBRC) as well as the existing Bug Report Corpus (BRC). Precision, recall, pyramid precision, and F-score are the performance indicators used to compare the outcomes. According to the experimental data, the proposed approach outperforms previous baseline approaches such as BRC and LRCA. It also outperforms state-of-the-art unsupervised methods such as Hurried, centroid, and others. To provide complete coverage and a coherent summary, it extracts essential keyword phrases and sentences from each cluster. Evaluated on the APBRC corpus, the approach achieves average values of 78.22%, 82.18%, 80.10%, and 81.66% for precision, recall, F-score, and pyramid precision, respectively.
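The membership-threshold step described above can be illustrated with a minimal sketch of fuzzy C-means in plain Python. This is not the authors' implementation: the function names, the fuzzifier m = 2, the fixed iteration count, the initial centers, and the idea of representing each sentence as a small numeric vector (for example, TF-IDF weights) are all assumptions made for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j.

    Standard fuzzy C-means rule: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)),
    with fuzzifier m > 1. Each row sums to 1.
    """
    exponent = 2.0 / (m - 1.0)
    table = []
    for p in points:
        dists = [distance(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on one or more centers:
            # share full membership among those centers.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            table.append([h / sum(hits) for h in hits])
        else:
            table.append(
                [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
            )
    return table


def update_centers(points, u, old_centers, m=2.0):
    """Recompute each center as the mean of all points weighted by u_ij^m."""
    dim = len(points[0])
    centers = []
    for j, old in enumerate(old_centers):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        if total == 0.0:
            centers.append(list(old))  # no point supports this cluster
            continue
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total
             for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, init_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = [list(c) for c in init_centers]
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, centers, m)
    return centers, memberships(points, centers, m)


def select_by_membership(sentences, u, threshold):
    """For each cluster, keep sentences whose membership is >= threshold."""
    n_clusters = len(u[0])
    return [
        [s for s, row in zip(sentences, u) if row[j] >= threshold]
        for j in range(n_clusters)
    ]


# Hypothetical toy data: 2-D vectors standing in for sentence features.
sentences = ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]
vectors = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
           [5.0, 5.0]]  # s7 lies between the two groups
centers, u = fuzzy_c_means(vectors, [[0.0, 0.0], [10.0, 10.0]])
for cluster_id, chosen in enumerate(select_by_membership(sentences, u, 0.8)):
    print(cluster_id, chosen)
```

In this toy setup, a sentence that sits between two clusters receives a split membership and falls below the threshold, so only sentences that clearly belong to a cluster are passed on as candidates. In the described method, a rule engine would then make the final selection from such candidates.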