SPAR: a random forest-based predictor for self-interacting proteins with fine-grained domain information
Liu, Xuhan1; Yang, Shiping1; Li, Chen2,3; Zhang, Ziding1; Song, Jiangning2,3,4,5,6
Journal: AMINO ACIDS
2016-07-01
Volume: 48; Issue: 7; Pages: 1655-1665
Keywords: Self-interacting protein; Prediction; Machine learning; Feature selection; Domain-domain interaction
Abstract: Protein self-interaction, i.e. the interaction between two or more identical proteins expressed by one gene, plays an important role in the regulation of cellular functions. Considering the limitations of experimental self-interaction identification, it is necessary to design specific bioinformatics tools for self-interacting protein (SIP) prediction from protein sequence information. In this study, we proposed an improved computational approach for SIP prediction, termed SPAR (Self-interacting Protein Analysis serveR). First, we developed an improved encoding scheme named critical residues substitution (CRS), in which fine-grained domain-domain interaction information was taken into account. Then, using the Random Forest algorithm, the performance of CRS was evaluated and compared with several other encoding schemes commonly used for sequence-based protein-protein interaction prediction. In tenfold cross-validation tests on a balanced training dataset, CRS performed the best, with an average accuracy of up to 72.01%. We further integrated CRS with other encoding schemes and identified the most important features using the mRMR (minimum redundancy maximum relevance) feature selection method. Our SPAR model with the selected features achieved an average accuracy of 92.09% on the independent human test set (the ratio of positives to negatives was about 1:11). We also evaluated the performance of SPAR on an independent yeast test set (the ratio of positives to negatives was about 1:8) and obtained an average accuracy of 76.96%. The results demonstrate that SPAR is capable of achieving reasonable performance in cross-species application. The SPAR server is freely available for academic use at http://systbio.cau.edu.cn/zzdlab/spar/.
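The mRMR step named in the abstract can be illustrated with a minimal sketch. This is not the SPAR implementation: it assumes discrete features, uses the common "difference" form of the criterion (relevance minus mean redundancy), and runs on a toy dataset whose feature names (`f_a`, `f_a_copy`, `f_c`, `noise`) are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr_select(features, labels, k):
    """Greedy mRMR, difference form (an assumed variant): at each step pick
    the feature with the highest I(feature; label) minus its mean mutual
    information with the features already selected."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected, remaining = [], list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): 5 negatives followed by 5 positives.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
features = {
    "f_a":      [0, 0, 0, 0, 1, 1, 1, 1, 1, 1],  # informative
    "f_a_copy": [0, 0, 0, 0, 1, 1, 1, 1, 1, 1],  # exact duplicate of f_a
    "f_c":      [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # weaker, but partly complementary
    "noise":    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],  # nearly uninformative
}

print(mrmr_select(features, labels, 2))
```

On this toy data the duplicate column is as relevant as `f_a` but fully redundant with it, so the second pick goes to `f_c` instead. A ranking by relevance alone would have chosen the duplicate.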
WOS Headings: Science & Technology ; Life Sciences & Biomedicine
WOS Subject Category: Biochemistry & Molecular Biology
WOS Research Area: Biochemistry & Molecular Biology
WOS Keywords: INTERACTION NETWORKS ; WEB SERVER ; DATABASE ; SEQUENCE ; UPDATE ; SIMILARITY ; CURATION ; BIOLOGY ; DIMER
Indexed by: SCI
Language: English
WOS Accession Number: WOS:000377409900011
Content type: Journal article
Source URL: http://124.16.173.210/handle/834782/2900
Collection: Tianjin Institute of Industrial Biotechnology_Laboratory of Structural Bioinformatics and Integrative Systems Biology (Song Jiangning)_Journal articles
Author affiliations:
1.China Agr Univ, Coll Biol Sci, State Key Lab Agrobiotechnol, Beijing 100193, Peoples R China
2.Monash Univ, Biomed Discovery Inst, Infect & Immun Program, Melbourne, Vic 3800, Australia
3.Monash Univ, Dept Biochem & Mol Biol, Melbourne, Vic 3800, Australia
4.Monash Univ, Fac Informat Technol, Monash Ctr Data Sci, Melbourne, Vic 3800, Australia
5.Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Natl Engn Lab Ind Enzymes, Tianjin 300308, Peoples R China
6.Chinese Acad Sci, Tianjin Inst Ind Biotechnol, Key Lab Syst Microbial Biotechnol, Tianjin 300308, Peoples R China
Recommended citation
GB/T 7714
Liu, Xuhan,Yang, Shiping,Li, Chen,et al. SPAR: a random forest-based predictor for self-interacting proteins with fine-grained domain information[J]. AMINO ACIDS,2016,48(7):1655-1665.
APA Liu, Xuhan,Yang, Shiping,Li, Chen,Zhang, Ziding,&Song, Jiangning.(2016).SPAR: a random forest-based predictor for self-interacting proteins with fine-grained domain information.AMINO ACIDS,48(7),1655-1665.
MLA Liu, Xuhan,et al."SPAR: a random forest-based predictor for self-interacting proteins with fine-grained domain information".AMINO ACIDS 48.7(2016):1655-1665.
 
