Automatic Keyword Extraction for Text Summarization: A Survey
📝 Abstract
Data is growing rapidly in every domain, including news, social media, banking, and education. Given this volume of data, there is a need for an automatic summarizer capable of summarizing data, especially textual data, from the original document without losing any critical information. Text summarization has emerged as an important research area in the recent past. A review of existing work on text summarization is therefore useful for carrying out further research. This paper presents recent literature on automatic keyword extraction and text summarization, since text summarization depends heavily on keyword extraction. The review discusses the different methodologies used for keyword extraction and text summarization. It also discusses the databases used for text summarization in several domains, along with evaluation metrics. Finally, it briefly discusses the issues and research challenges faced by researchers, along with future directions.
📄 Content
National Institute of Technology, Rourkela, Odisha 769008 India e-mail: {1sbharti1984, 2prof.ksb}@gmail.com, 3skjena@nitrkl.ac.in
08-February-2017
Keywords:
Abstractive summary, extractive summary, keyword extraction, natural language processing, text summarization.
INTRODUCTION
In the era of the internet, a plethora of online information is
freely available to readers in the form of e-newspapers,
journal articles, technical reports, transcribed dialogues, etc.
A huge number of documents are available in these digital
media, and extracting only the relevant information from all of
them within a stipulated time is a tedious job for an
individual. There is a need for an automated system that can
extract only relevant information from these data sources. To
achieve this, one needs to mine the text from the documents.
Text mining is the process of analyzing large quantities of text
to derive high-quality information. Text mining deploys some of
the techniques of natural language processing (NLP), such as
part-of-speech (POS) tagging, parsing, N-grams, and
tokenization, to perform the text analysis. It includes tasks
like automatic keyword extraction and text summarization.
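
Two of the NLP techniques named above, tokenization and N-grams, can be illustrated with a minimal Python sketch. The function names `tokenize` and `ngrams` and the regular expression are illustrative assumptions, not taken from the surveyed work; production systems typically rely on a dedicated NLP toolkit, and this tokenizer splits hyphenated words into separate tokens.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens.

    Assumption: a simple regex tokenizer; punctuation is discarded and
    hyphenated words are split into their parts.
    """
    return re.findall(r"[a-z0-9]+", text.lower())

def ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = tokenize("Text mining derives information from text.")
bigrams = ngrams(tokens, 2)  # pairs of adjacent tokens
```

Here `bigrams` holds the adjacent word pairs of the sentence; longer windows (trigrams and so on) follow by changing `n`.
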
Automatic keyword extraction is the process of selecting the
words and phrases from a text document that best convey the
core sentiment of the document, without any human
intervention, depending on the model [1]. The goal of
automatic keyword extraction is to apply the power and speed
of current computing capabilities to the problem of
access and recovery, with an emphasis on information
organization without the added cost of human annotators.
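
As a rough illustration of keyword extraction without human annotators, the following Python sketch ranks words by raw term frequency. This is only a simple baseline chosen for illustration, not the method of [1] or of any particular surveyed paper; the `STOP_WORDS` set, the `extract_keywords` name, and the `top_k` parameter are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; real systems use a
# much larger, language-specific list.
STOP_WORDS = {"a", "an", "and", "are", "for", "from", "in", "is", "of",
              "the", "to", "with"}

def extract_keywords(text, top_k=5):
    """Return the top_k most frequent non-stop-words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words
                     if w not in STOP_WORDS and len(w) > 2)
    # most_common sorts by descending count; ties keep first-seen order.
    return [word for word, _ in counts.most_common(top_k)]
```

Frequency alone ignores phrase structure and word meaning, which is one reason the literature explores statistical, linguistic, and learning-based alternatives.
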
Summarization is a process in which the most salient features
of a text are extracted and compiled into a short abstract of
the original document [2]. According to Mani and Maybury [3],
text summarization is the process of distilling the most
important information from a text to produce an abridged
version for a particular task and user. Summaries are usually
around 17% of the original text and yet contain everything
that could have been learned from reading the original article
[4]. In the wake of big data analysis, summarization is an
efficient and powerful technique for giving a glimpse of the
whole data. Text summarization can be achieved in two ways,
namely abstractive summary and extractive summary.
Abstractive summarization is a topic of extensive research;
however, no standard algorithm has been established yet. These
summaries are derived by learning what was expressed in the
article and then converting it into a form expressed by the
computer. This resembles how a human would summarize an
article after reading it. An extractive summary, in contrast,
extracts details from the original article itself and presents
them to the reader.
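
The extractive approach can be sketched in a few lines of Python: score each sentence by the average frequency of its words and keep the highest-scoring sentences in their original order. This is a minimal sketch under stated assumptions, not an algorithm from the surveyed papers. The `summarize` name is hypothetical, the sentence splitter is a naive regex, and the default `ratio=0.17` merely echoes the 17% figure quoted above, applied here to the sentence count rather than to text length.

```python
import math
import re
from collections import Counter

def summarize(text, ratio=0.17):
    """Return an extractive summary made of the top-scoring sentences."""
    # Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence):
        # Average document-level frequency of the words in the sentence.
        words = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    keep = max(1, math.ceil(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:keep])  # restore original sentence order
    return " ".join(sentences[i] for i in chosen)
```

Because the output is copied verbatim from the input, this sketch is purely extractive; an abstractive system would instead generate new sentences.
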
Santosh Kumar Bharti1, Korra Sathya Babu2, and Sanjay Kumar Jena3
In this paper, we review the recent literature on automatic
keyword extraction and text summarization. The extraction of
valuable keywords is the primary phase of text
summarization. Therefore, in this review, we focus on
both techniques. For keyword extraction, the review
discusses the different methodologies used for the keyword
extraction process and the algorithms used under each
methodology, as shown in Figure 1. It also discusses the
different domains in which keyword extraction algorithms
are applied. Similarly, in the process of text summarization,
the literature covers all the possible pr