📝 Original Info
- Title: Fuzzy Logic Based Method for Improving Text Summarization
- ArXiv ID: 0906.4690
- Date: 2009-06-26
- Authors: Ladda Suanmali, Naomie Salim, Mohammed Salem Binwahlan
📝 Abstract
Text summarization can be classified into two approaches: extraction and abstraction. This paper focuses on the extraction approach, whose goal is sentence selection. One way to obtain suitable sentences is to assign each sentence a numerical measure of its value for the summary, called sentence weighting, and then select the best ones. The first step in summarization by extraction is the identification of important features. In our experiment, we used 125 test documents from the DUC2002 data set. Each document is prepared by preprocessing: sentence segmentation, tokenization, stop word removal, and word stemming. Then, we use 8 important features and calculate their scores for each sentence. We propose text summarization based on fuzzy logic to improve the quality of the summary created by the general statistic method. We compare our results with the baseline summarizer and the Microsoft Word 2007 summarizer. The results show that the best average precision, recall, and f-measure for the summaries were obtained by the fuzzy method.
📄 Full Content
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 2, No. 1, 2009
Fuzzy Logic Based Method for Improving Text
Summarization
Ladda Suanmali1, Naomie Salim2 and Mohammed Salem Binwahlan3
1Faculty of Science and Technology, Suan Dusit Rajabhat University, Bangkok, Thailand 10300
2,3Faculty of Computer Science and Information System, Universiti Teknologi Malaysia 81310
E-mail: 1ladda_sua@dusit.ac.th, 2 naomie@utm.my, 3 moham2007med@yahoo.com
Abstract—Text summarization can be classified into two
approaches: extraction and abstraction. This paper focuses on
the extraction approach, whose goal is sentence selection. One
way to obtain suitable sentences is to assign each sentence a
numerical measure of its value for the summary, called sentence
weighting, and then select the best ones. The first step in
summarization by extraction is the identification of important
features. In our experiment, we used 125 test documents from the
DUC2002 data set. Each document is prepared by preprocessing:
sentence segmentation, tokenization, stop word removal, and word
stemming. Then, we used 8 important features and calculated
their scores for each sentence. We proposed text summarization
based on fuzzy logic to improve the quality of the summary
created by the general statistic method. We compared our results
with the baseline summarizer and the Microsoft Word 2007
summarizer. The results show that the best average precision,
recall, and f-measure for the summaries were obtained by the
fuzzy method.
Keywords- fuzzy logic; sentence feature; text summarization
I. INTRODUCTION
In the current era of information overload, text
summarization has become an important and timely tool for
helping users interpret the large volumes of text available in
documents.
The goal of text summarization is to present the most
important information in a shorter version of the original text
while keeping its main content, helping the user to quickly
understand large volumes of information. Text summarization
addresses both the problem of selecting the most important
sections of text and the problem of generating coherent
summaries. This process differs significantly from human
summarization, since humans can capture and relate the deep
meanings and themes of text documents, whereas automating
such a skill is very difficult.
Since Luhn's work [1], researchers in automatic text
summarization have tried to solve, or at least relieve, that
problem by proposing techniques for generating summaries.
Summaries serve as a quick guide to interesting information,
providing a short form of each document in the document set;
reading a summary helps the reader decide whether to read the
whole document, and so also saves time. A number of
researchers have proposed techniques for automatic text
summarization, which can be classified into two categories:
extraction and abstraction. An extraction summary is a
selection of the highest-scoring sentences or phrases from the
original text, put together into a new, shorter text without
changing the source text. An abstraction summary uses
linguistic methods to examine and interpret the text. Most
current automated text summarization systems use the
extraction method to produce summaries. Automatic text
summarization works best on well-structured documents, such
as news, reports, articles and scientific papers.
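The extraction idea described above (score each sentence, keep the
highest-scoring ones, leave the source wording unchanged) can be
sketched as follows. This is a minimal illustration, not the method
evaluated in this paper: the regex sentence splitter, the function
names, and the word-count score are all assumptions chosen only to
make the example runnable.

```python
import re

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def extract_summary(text, score_fn, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Rank sentence indices by score, highest first.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score_fn(sentences[i]),
                    reverse=True)
    # Restore document order so the summary reads like the source.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]

def word_count_score(sentence):
    """Placeholder score (assumption): longer sentences score higher."""
    return len(sentence.split())

doc = ("Cats sleep. Text summarization selects the most important sentences. "
       "It is useful. Extraction copies sentences from the source without "
       "changing them.")
summary = extract_summary(doc, word_count_score, k=2)
```

Any sentence weighting can be passed as `score_fn`; the selection step
itself does not depend on how the scores are produced.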
The first step in summarization by extraction is the
identification of important features such as sentence length,
sentence location [11], term frequency [6], number of words
occurring in the title [5], number of proper nouns [14] and
amount of numerical data [13]. In our approach, we utilize a
feature fusion technique to discover which of the available
features are most useful.
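To make the notion of sentence features concrete, the sketch below
computes four of the features named above for each sentence: overlap
with the title, relative length, location, and numerical data. The
normalizations used here are assumptions for illustration only; they
are not the formulas defined in this paper, and the function and key
names are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())

def sentence_features(sentences, title):
    """Return one dict of illustrative feature scores in [0, 1] per sentence."""
    title_words = set(tokenize(title))
    tokens = [tokenize(s) for s in sentences]
    longest = max((len(t) for t in tokens), default=0) or 1
    n = len(sentences)
    features = []
    for pos, words in enumerate(tokens):
        count = len(words) or 1
        features.append({
            # Share of distinct title words that appear in the sentence.
            "title_overlap": len(set(words) & title_words) / (len(title_words) or 1),
            # Length relative to the longest sentence in the document.
            "length": len(words) / longest,
            # Earlier sentences score higher; the first sentence scores 1.0.
            "location": (n - pos) / n,
            # Share of tokens that are purely numeric.
            "numeric": sum(w.isdigit() for w in words) / count,
        })
    return features

feats = sentence_features(
    ["Fuzzy logic improves text summarization.", "We used 125 documents."],
    title="Fuzzy Logic Text Summarization",
)
```

Scaling every feature to the same range is a design choice made here
so that the scores can later be combined or compared directly.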
In this paper, we propose text summarization based on the
fuzzy logic method to extract important sentences as a
summary. The rest of this paper is organized as follows.
Section II presents the summarization approaches. Section III
describes preprocessing and the important features. Sections IV
and V describe our proposed method, followed by the
experimental design, experimental results and evaluation.
Finally, in Section VI we conclude and suggest future work.
II. SUMMARIZATION APPROACHES
In early classic summarization systems, summaries were
created according to the most frequent words in the text. Luhn
created the first summarization system [1] in 1958. Rath et
al. [2] in 1961 provided empirical evidence for the
difficulties inherent in the notion of an ideal summary. Both
studies used thematic features such as term frequency, and thus
are characterized as surface-level approaches. In the early
1960s, new approaches called entity-level approaches
appeared; the first approach of t
…(Full text truncated)…