Fuzzy Logic Based Method for Improving Text Summarization

Reading time: 6 minutes

📝 Original Info

  • Title: Fuzzy Logic Based Method for Improving Text Summarization
  • ArXiv ID: 0906.4690
  • Date: 2009-06-26
  • Authors: Ladda Suanmali, Naomie Salim, Mohammed Salem Binwahlan

📝 Abstract

Text summarization follows one of two approaches: extraction or abstraction. This paper focuses on the extraction approach, whose goal is sentence selection. One way to obtain suitable sentences is to assign each sentence a numerical measure of its value to the summary, called sentence weighting, and then select the highest-weighted ones. The first step in summarization by extraction is the identification of important features. In our experiment, we used 125 test documents from the DUC2002 data set. Each document is preprocessed by sentence segmentation, tokenization, stop-word removal, and word stemming. We then use 8 important features and calculate their scores for each sentence. We propose text summarization based on fuzzy logic to improve the quality of the summary created by the general statistical method. We compare our results with the baseline summarizer and the Microsoft Word 2007 summarizer. The results show that the best average precision, recall, and F-measure for the summaries were obtained by the fuzzy method.
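The four preprocessing steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the regular expressions, and the `crude_stem` helper are all assumptions made for illustration, and the suffix stripper is only a stand-in for a real stemming algorithm.

```python
import re

# Tiny illustrative stop-word list (assumption); real systems use larger lists.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "in", "on",
              "and", "or", "to", "by", "for", "with", "this", "that", "it"}

# Suffixes removed by the crude stemmer below (assumption: a stand-in for a
# proper stemming algorithm, which is not reproduced here).
SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text):
    """Naive sentence segmentation: split after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Return one list of stemmed, stop-word-free tokens per sentence."""
    result = []
    for sentence in split_sentences(text):
        tokens = [t for t in tokenize(sentence) if t not in STOP_WORDS]
        result.append([crude_stem(t) for t in tokens])
    return result
```

For example, `preprocess("The cats were sleeping. Dogs barked loudly!")` yields one token list per sentence, with stop words dropped and suffixes stripped.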


📄 Full Content

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 2, No. 1, 2009

Fuzzy Logic Based Method for Improving Text Summarization

Ladda Suanmali (1), Naomie Salim (2) and Mohammed Salem Binwahlan (3)
(1) Faculty of Science and Technology, Suan Dusit Rajabhat University, Bangkok, Thailand 10300
(2, 3) Faculty of Computer Science and Information System, Universiti Teknologi Malaysia 81310
E-mail: (1) ladda_sua@dusit.ac.th, (2) naomie@utm.my, (3) moham2007med@yahoo.com

Abstract: Text summarization follows one of two approaches: extraction or abstraction. This paper focuses on the extraction approach, whose goal is sentence selection. One way to obtain suitable sentences is to assign each sentence a numerical measure of its value to the summary, called sentence weighting, and then select the highest-weighted ones. The first step in summarization by extraction is the identification of important features. In our experiment, we used 125 test documents from the DUC2002 data set. Each document was preprocessed by sentence segmentation, tokenization, stop-word removal, and word stemming. We then used 8 important features and calculated their scores for each sentence. We proposed text summarization based on fuzzy logic to improve the quality of the summary created by the general statistical method. We compared our results with the baseline summarizer and the Microsoft Word 2007 summarizer. The results show that the best average precision, recall, and F-measure for the summaries were obtained by the fuzzy method.

Keywords: fuzzy logic; sentence feature; text summarization

I. INTRODUCTION

In the current era of information overload, text summarization has become an important and timely tool for helping users interpret the large volumes of text available in documents.
The goal of text summarization is to present the most important information in a shorter version of the original text while keeping its main content, helping the user to quickly understand large volumes of information. Text summarization addresses both the problem of selecting the most important sections of text and the problem of generating coherent summaries. This process differs significantly from human text summarization, since humans can capture and relate the deep meanings and themes of text documents, while automating such a skill is very difficult. Since Luhn's work [1], researchers in automatic text summarization have tried to solve, or at least relieve, that problem by proposing techniques for generating summaries. The summaries serve as a quick guide to interesting information, providing a short form of each document in the document set; reading a summary helps the user decide whether to read the whole document, and so saves time.

A number of researchers have proposed techniques for automatic text summarization, which can be classified into two categories: extraction and abstraction. An extraction summary is a selection of the highest-scoring sentences or phrases from the original text, put together into a new, shorter text without changing the source text. The abstraction method uses linguistic methods to examine and interpret the text. Most current automated text summarization systems use the extraction method to produce a summary. Automatic text summarization works best on well-structured documents, such as news, reports, articles and scientific papers.

The first step in summarization by extraction is the identification of important features such as sentence length, sentence location [11], term frequency [6], number of words occurring in the title [5], number of proper nouns [14] and number of numerical data [13].
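To make the idea of feature-based sentence weighting concrete, here is a minimal sketch that scores four of the features listed above (sentence length, sentence location, title words, numerical data) and selects the top-weighted sentences. The scoring formulas, the function names, and the equal-weight average are assumptions for illustration only; the sketch does not reproduce the paper's feature definitions, and it uses a plain average in place of the proposed fuzzy logic method.

```python
import re


def tokenize(text):
    """Keep alphanumeric runs as tokens, preserving case."""
    return re.findall(r"[A-Za-z0-9]+", text)


def feature_scores(sentences, title):
    """Return one dict of feature scores in [0, 1] per sentence.

    All formulas here are illustrative assumptions, not the paper's.
    """
    title_words = {w.lower() for w in tokenize(title)}
    token_lists = [tokenize(s) for s in sentences]
    longest = max((len(t) for t in token_lists), default=0) or 1
    n = len(sentences)
    scores = []
    for i, tokens in enumerate(token_lists):
        count = len(tokens) or 1  # avoid division by zero on empty sentences
        lowered = [t.lower() for t in tokens]
        scores.append({
            # Longer sentences score higher, relative to the longest one.
            "length": len(tokens) / longest,
            # Earlier sentences score higher.
            "location": (n - i) / n,
            # Share of tokens that also occur in the title.
            "title": sum(w in title_words for w in lowered) / count,
            # Share of tokens that are purely numeric.
            "numeric": sum(t.isdigit() for t in tokens) / count,
        })
    return scores


def select_sentences(sentences, title, k):
    """Pick the k highest-weighted sentences, returned in document order."""
    scores = feature_scores(sentences, title)
    # Equal-weight average of the features (assumption; not fuzzy inference).
    weights = [sum(s.values()) / len(s) for s in scores]
    ranked = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Returning the chosen sentences in their original order keeps the extract readable, which matches the description of an extraction summary as a shorter text assembled without changing the source sentences.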
In our approach, we utilize a feature fusion technique to discover which of the available features are most useful. In this paper, we propose text summarization based on a fuzzy logic method to extract important sentences as a summary.

The rest of this paper is organized as follows. Section II presents the summarization approaches. Section III describes preprocessing and the important features. Sections IV and V describe our proposed method, followed by the experimental design, experimental results and evaluation. Finally, Section VI concludes and suggests future work.

II. SUMMARIZATION APPROACHES

In early classic summarization systems, summaries were created according to the most frequent words in the text. Luhn created the first summarization system [1] in 1958. Rath et al. [2] in 1961 provided empirical evidence of the difficulties inherent in the notion of an ideal summary. Both studies used thematic features such as term frequency; thus they are characterized as surface-level approaches. In the early 1960s, new approaches called entity-level approaches appeared; the first approach of t

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
