Predicting disease-related genes by path-based similarity and community structure in protein-protein interaction network

February 23, 2026

📝 Abstract

Network-based computational approaches for predicting unknown genes associated with specific diseases are of considerable significance for uncovering the molecular basis of human diseases. In this paper, we propose a new class of disease-gene prediction methods that combine path-based similarity with the community structure of the human protein-protein interaction network. First, we introduce a set of path-based similarity indices, a novel community-based similarity index, and a new similarity index that combines the path-based similarity with the community structure. Then we assess the statistical significance of these measures in distinguishing disease genes from non-disease genes, to confirm their usefulness for predicting disease genes. Finally, we apply the measures to disease-gene prediction within single disease-gene families and analyze their performance in detail, in particular the effect of community structure on prediction performance. The results indicate that genes associated with the same or similar diseases commonly reside in the same community of the protein-protein interaction network, and that community structure substantially improves disease-gene prediction.
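The pipeline outlined in the abstract (a path-based similarity, a community-based term, a combination of the two, and a score of each candidate gene against the known disease genes) can be illustrated with a small sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the path-based index is approximated by damped walk counts up to a maximum length, the community term is a simple same-community indicator, the combination is multiplicative, and the names `path_similarity`, `combined_similarity`, `rank_candidates`, `beta`, and `alpha` are hypothetical. The toy network is made up and is not PPI data.

```python
from collections import defaultdict


def build_graph(edges):
    """Build undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def walk_counts(adj, source, max_len):
    """Return counts where counts[l][node] is the number of walks of length l
    from source to node."""
    counts = [{source: 1}]
    for _ in range(max_len):
        nxt = defaultdict(int)
        for node, c in counts[-1].items():
            for nb in adj.get(node, ()):
                nxt[nb] += c
        counts.append(dict(nxt))
    return counts


def path_similarity(adj, u, v, max_len=3, beta=0.1):
    """ASSUMPTION: damped walk counts stand in for the paper's path-based index.
    Shorter connections weigh more because beta < 1."""
    counts = walk_counts(adj, u, max_len)
    return sum(beta ** l * counts[l].get(v, 0) for l in range(1, max_len + 1))


def combined_similarity(adj, community, u, v, alpha=1.0, max_len=3, beta=0.1):
    """ASSUMPTION: boost the path term when both genes share a community."""
    same = 1.0 if community[u] == community[v] else 0.0
    return path_similarity(adj, u, v, max_len, beta) * (1.0 + alpha * same)


def rank_candidates(adj, community, disease_genes, alpha=1.0):
    """Score each non-seed gene by its summed similarity to the known disease
    genes, then sort from highest to lowest score."""
    seeds = set(disease_genes)
    scores = {
        g: sum(combined_similarity(adj, community, g, s, alpha) for s in seeds)
        for g in adj
        if g not in seeds
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy network: two triangles joined by the bridge C-D (illustrative only).
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"), ("D", "E"), ("E", "F"), ("D", "F")]
adj = build_graph(edges)
community = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}

ranking = rank_candidates(adj, community, disease_genes=["A", "B"])
for gene, score in ranking:
    print(gene, round(score, 4))
```

In this toy setting, setting `alpha=0` switches the community term off, which gives a simple way to compare rankings with and without community information. In practice the community assignment would come from a community-detection algorithm run on the PPI network, which this sketch takes as given.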

📄 Content

Manuscript Thursday, July 20, 2017 1

Predicting disease-related genes by path-based similarity and community structure in protein-protein interaction network

Ke Hu1, Jing-Bo Hu1, Ju Xiang2,*, Hui-Jia Li3,4,*, Yan Zhang5,*, Shi Chen5, Chen-He Yi7

1Department of Physics, Xiangtan University, Xiangtan 411105, Hunan, China
2Neuroscience Research Center & Department of Basic Medical Sciences, Changsha Medical University, Changsha 410219, Hunan, China
3School of Management Science and Engineering, Central University of Finance and Economics, Beijing 100080, China
4Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
5Department of Computer, Changsha Medical University, Changsha 410219, Hunan, China
6School of Public Administration, Xiangtan University, Xiangtan 411105, Hunan, China

*Corresponding authors: Ju Xiang, Hui-Jia Li, and Yan Zhang.
E-mail: xiang.ju@foxmail.com (J.X.); xiangju@aliyun.com (J.X.); hjli@amss.ac.cn (H.J.L.); zhangyancsmu@foxmail.com (Y.Z.); huke1998@aliyun.com (K.H.); chenshi198001@qq.com (S.C.)
PACS: 89.75.-k; 89.75.Fb; 89.75.Hc

Keywords: Complex networks; Community structure; Topological similarity; Protein-protein interaction networks; Disease genes

CONTENTS

1. Introduction
2. Datasets
    2.1. Human PPI Datasets
    2.2. Disease-Gene Data
3. Methods
    3.1. Definition of the topological similarity
        3.1.1. Path-based Similarity (PS)
        3.1.2. Community-based Similarity (CS)
        3.1.3. Combined similarity based on path structure and community structure
    3.2. Similarity scores of genes with respect to the disease genes
    3.3. Metric
4. Experimental results
    4.1. Analysis of feasibility
    4.2. Performance of method
        4.2.1. ROC and AUC
        4.2.2. Precision …

This content is AI-processed based on ArXiv data.
