Automatic Keyword Extraction for Text Summarization: A Survey

📝 Abstract

Data is growing rapidly in every domain, including news, social media, banking, and education. Because of this volume, there is a need for automatic summarizers that can condense data, especially the textual data in an original document, without losing critical information. Text summarization has emerged as an important research area in the recent past, so a review of existing work on the summarization process is useful for further research. This paper presents recent literature on automatic keyword extraction and text summarization, since the summarization process depends heavily on keyword extraction. The review discusses the methodologies used for keyword extraction and text summarization, as well as the databases used for text summarization in several domains and the associated evaluation metrics. Finally, it briefly discusses the issues and research challenges faced by researchers, along with future directions.

📄 Content

National Institute of Technology, Rourkela, Odisha 769008, India
e-mail: {1sbharti1984, 2prof.ksb}@gmail.com, 3skjena@nitrkl.ac.in

08-February-2017

Keywords:

Abstractive summary, extractive summary, keyword extraction, natural language processing, text summarization.

1. INTRODUCTION

In the era of the internet, a plethora of online information is freely available to readers in the form of e-newspapers, journal articles, technical reports, transcribed dialogues, etc. A huge number of documents are available in these digital media, and extracting only the relevant information from them within a stipulated time is a tedious job for an individual. There is a need for an automated system that can extract only the relevant information from these data sources. To achieve this, one needs to mine the text of the documents. Text mining is the process of analyzing large quantities of text to derive high-quality information. It deploys natural language processing (NLP) techniques such as part-of-speech (POS) tagging, parsing, N-grams, and tokenization to perform the text analysis, and it includes tasks such as automatic keyword extraction and text summarization.
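To make two of the NLP techniques named above concrete, the following is a minimal sketch of tokenization and N-gram generation. It is not taken from the survey: the function names `tokenize` and `ngrams` and the regular-expression tokenizer are illustrative assumptions, and production systems typically rely on a dedicated NLP toolkit instead.

```python
import re
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens.

    Assumption: a token is a run of letters, digits, or apostrophes.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous window of n tokens, in document order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    tokens = tokenize("Text mining derives information from text.")
    print(tokens)
    print(ngrams(tokens, 2))
```

Unigrams and bigrams produced this way are common candidate units for the keyword extraction task discussed next.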
Automatic keyword extraction is the process of selecting, without human intervention, the words and phrases from a text document that best convey the core sentiment of the document, depending on the model [1]. Its goal is to apply the power and speed of current computing to the problem of access and recovery, emphasizing information organization without the added cost of human annotators.
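As a rough illustration of the idea, the sketch below ranks candidate keywords by raw term frequency after removing stop words. This is one simple statistical baseline, not the method of any specific work cited in the survey; the `STOP_WORDS` set, the minimum word length, and the function name `extract_keywords` are assumptions made for the example.

```python
import re
from collections import Counter
from typing import List, Tuple

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "for", "from", "in",
    "is", "it", "of", "on", "the", "to", "with",
}


def extract_keywords(text: str, top_k: int = 5) -> List[Tuple[str, int]]:
    """Return the top_k most frequent non-stop-word terms with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    # Drop stop words and very short tokens, which rarely act as keywords.
    candidates = [w for w in words if w not in STOP_WORDS and len(w) > 2]
    return Counter(candidates).most_common(top_k)


if __name__ == "__main__":
    sample = "The summary of the text is short. A summary keeps the key text."
    print(extract_keywords(sample, top_k=3))
```

Frequency alone ignores word position, linguistic role, and co-occurrence, which is why more elaborate methodologies exist.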
Summarization is a process in which the most salient features of a text are extracted and compiled into a short abstract of the original document [2]. According to Mani and Maybury [3], text summarization is the process of distilling the most important information from a text to produce an abridged version for a particular task and user. Summaries are usually around 17% of the length of the original text and yet contain everything that could have been learned from reading the original article [4]. In the wake of big data analysis, summarization is an efficient and powerful technique for giving a glimpse of the whole data. Text summarization can be achieved in two ways, namely abstractive summary and extractive summary. Abstractive summarization is a topic of intense research; however, no standard algorithm has been established yet. Abstractive summaries are derived by learning what was expressed in the article and then re-expressing it in a form generated by the computer, which resembles how a human would summarize an article after reading it. An extractive summary, in contrast, extracts details from the original article itself and presents them to the reader.
Santosh Kumar Bharti1, Korra Sathya Babu2, and Sanjay Kumar Jena3
In this paper, we review the recent literature on automatic keyword extraction and text summarization. Extracting valuable keywords is the primary phase of text summarization; therefore, this review focuses on both techniques. For keyword extraction, it discusses the methodologies used for the keyword extraction process and the algorithms used under each methodology, as shown in Figure 1. It also discusses the domains in which keyword extraction algorithms have been applied. Similarly, for the process of text summarization, the literature covers all the possible pr
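The link between keyword extraction and extractive summarization described in this introduction can be sketched as follows: sentences are scored by the average document-level frequency of their non-stop-word terms, and the highest-scoring sentences are returned in their original order. This is a hypothetical baseline, not an algorithm from the surveyed literature. The names `split_sentences` and `summarize`, the stop-word list, and the punctuation-based sentence splitter are assumptions, and the default `ratio` of 0.17 applies the 17% figure quoted above to the sentence count as a simplification.

```python
import math
import re
from collections import Counter
from typing import List

# Tiny illustrative stop-word list (an assumption, not from the survey).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "for", "from", "in",
    "is", "it", "of", "on", "the", "to", "with",
}


def content_words(text: str) -> List[str]:
    """Lowercased alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text: str, ratio: float = 0.17) -> str:
    """Keep roughly `ratio` of the sentences, chosen by keyword frequency."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    freq = Counter(content_words(text))

    def score(sentence: str) -> float:
        words = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    keep = max(1, math.ceil(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:keep]
    # Restore document order so the extract reads naturally.
    return " ".join(sentences[i] for i in sorted(ranked))
```

Because the output reuses the source sentences verbatim, this is an extractive summary in the sense used above; an abstractive system would instead generate new wording.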

This content is AI-processed based on ArXiv data.
