A Comparative Study for Predicting Heart Diseases Using Data Mining Classification Methods


📝 Abstract

Many researchers have investigated ways to improve the precision of heart disease detection. This effort is driven by overwhelming health care expenditures and erroneous diagnoses. As a result, various methodologies have been proposed to analyze the disease factors, aiming to decrease variation in physicians' practice and to reduce medical costs and errors. In this paper, our main motivation is to develop an effective intelligent medical decision support system based on data mining techniques. In this context, five data mining classification algorithms (Naïve Bayes, Decision Tree, Discriminant, Random Forest, and Support Vector Machine) have been applied to large datasets to assess and analyze the risk factors statistically related to heart diseases and to compare the performance of the implemented classifiers. To underscore the practical viability of our approach, the selected classifiers have been implemented in MATLAB with two datasets. The experimental results show that all of the classification algorithms are predictive and give relatively correct answers. However, the decision tree outperforms the other classifiers with an accuracy rate of 99.0%, followed by the random forest. The random forest comes second because the two share a relatively similar mechanism, although the random forest builds an ensemble of decision trees. Although ensemble learning has been shown to produce superior results, in our case the decision tree outperformed its ensemble version.

📄 Content

International Journal of Computer Science and Information Security (IJCSIS), Vol. 14, No. 12, December 2016 868 https://sites.google.com/site/ijcsis/
ISSN 1947-5500

A Comparative Study for Predicting Heart Diseases Using Data Mining Classification Methods
Isra’a Ahmed Zriqat, Ahmad Mousa Altamimi, Mohammad Azzeh
Faculty of Information Technology, Applied Science Private University, Amman, Jordan
{i_zriqat, a_altamimi, m.y.azzah}@asu.edu.jo

Abstract- Many researchers have investigated ways to improve the precision of heart disease detection. This effort is driven by overwhelming health care expenditures and erroneous diagnoses. As a result, various methodologies have been proposed to analyze the disease factors, aiming to decrease variation in physicians' practice and to reduce medical costs and errors. In this paper, our main motivation is to develop an effective intelligent medical decision support system based on data mining techniques. In this context, five data mining classification algorithms (Naïve Bayes, Decision Tree, Discriminant, Random Forest, and Support Vector Machine) have been applied to large datasets to assess and analyze the risk factors statistically related to heart diseases and to compare the performance of the implemented classifiers. To underscore the practical viability of our approach, the selected classifiers have been implemented in MATLAB with two datasets. The experimental results show that all of the classification algorithms are predictive and give relatively correct answers. However, the decision tree outperforms the other classifiers with an accuracy rate of 99.0%, followed by the random forest. The random forest comes second because the two share a relatively similar mechanism, although the random forest builds an ensemble of decision trees. Although ensemble learning has been shown to produce superior results, in our case the decision tree outperformed its ensemble version.

Keywords- Heart Diseases; Prediction Systems; Data Mining Classifiers; Ensemble Learning; Decision Tree

I. INTRODUCTION

Data mining techniques have been widely used in a variety of applications. In the health care industry, for example, data mining plays an important role in predicting or diagnosing diseases with good accuracy.
One important application is the diagnosis of heart (cardiovascular) diseases, which are recognized as the leading cause of death globally [1]. According to the World Heart Federation and the World Health Organization, more than 17 million people died from cardiovascular diseases in 2013, and around 3 million of these deaths occurred before the age of 60 [2]. However, 90% of those deaths were estimated to be preventable had the patients been correctly diagnosed early and improved their habits, such as healthy eating and exercise [3].
In traditional healthcare environments, the diagnosis of a disease depends on the doctor's decision in identifying the most likely cause from a person's symptoms. However, this leads to unwanted errors that result in higher medical costs and affect the quality of service provided to patients. Instead, expert systems (that use data mining techniques) [4] could be used to emulate the decision-making ability of a human expert, answering not only simple questions like “What is the average age of patients who have heart disease?” or “Identify the female patients who are single and who have been treated for heart diseases”, but also complex ones like “Given patient records, predict the probability that a patient will be diagnosed with heart disease” or “Find the most significant risk factor that results in heart disease”. Of course, using such systems could reduce medical errors and decrease practice variation; more notably, it can also improve diagnostic results.
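The difference between the simple and the complex questions above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the patient records, the field names, and the helper functions are all hypothetical. The last function is only a crude empirical frequency, a stand-in for the kind of probability estimate a trained classifier would provide.

```python
from statistics import mean

# Hypothetical patient records; field names and values are illustrative only.
patients = [
    {"age": 63, "sex": "F", "marital_status": "single", "heart_disease": True},
    {"age": 54, "sex": "M", "marital_status": "married", "heart_disease": True},
    {"age": 41, "sex": "F", "marital_status": "married", "heart_disease": False},
    {"age": 58, "sex": "F", "marital_status": "single", "heart_disease": False},
    {"age": 47, "sex": "M", "marital_status": "single", "heart_disease": True},
]


def average_age_with_disease(records):
    """Simple question: average age of patients who have heart disease."""
    return mean(r["age"] for r in records if r["heart_disease"])


def single_female_patients_with_disease(records):
    """Simple question: single female patients with heart disease."""
    return [
        r for r in records
        if r["sex"] == "F"
        and r["marital_status"] == "single"
        and r["heart_disease"]
    ]


def disease_rate_by(records, field):
    """Crude step toward a complex question: the observed fraction of
    patients with heart disease for each value of one attribute.
    This is an empirical frequency, not a trained classifier."""
    totals, diseased = {}, {}
    for r in records:
        key = r[field]
        totals[key] = totals.get(key, 0) + 1
        diseased[key] = diseased.get(key, 0) + (1 if r["heart_disease"] else 0)
    return {key: diseased[key] / totals[key] for key in totals}


if __name__ == "__main__":
    print(average_age_with_disease(patients))
    print(single_female_patients_with_disease(patients))
    print(disease_rate_by(patients, "sex"))
```

The first two functions are plain filters and aggregates over stored records. The third only hints at prediction: answering the complex questions for unseen patients requires a model learned from the data.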
Techniques of data mining can be used for discovering knowledge in huge volumes of data through detecting patterns and summarizing data into a format that can be understood. In fact, there are three main techniques of data mining that can be utilized to classify previously unorganized data into predefin

