An adaptive synchronous parallel strategy for distributed machine learning
Zhang, Jilin [1,2,3,4,5]; Tu, Hangdi [1,2]; Ren, Yongjian [1,2]; Wan, Jian [1,2,4,5]; Zhou, Li [1,2]; Li, Mingwei [1,2]; Wang, Jue [6]
Journal: IEEE Access
Publication year: 2018
Volume: 6; Pages: 19222-19230
Keywords: Distributed machine learning; Adaptive synchronous parallel; Communication strategy; Parameter server
ISSN: 2169-3536
DOI: 10.1109/access.2018.2820899
Corresponding author: Ren, Yongjian (yongjian.ren@hdu.edu.cn)
Abstract: In recent years, distributed systems have mainly been used to train machine learning (ML) models. However, as a result of the different performances among computational nodes in a distributed cluster and delays in network transmission, the accuracies and convergence rates of ML models are relatively low. Therefore, it is necessary to design a reasonable strategy that provides dynamic communication optimization to improve the utilization of the cluster, accelerate the training times, and strengthen the accuracy of the training model. In this paper, we propose the adaptive synchronous parallel strategy for distributed ML. Through the performance monitoring model, the synchronization strategy of each computational node with the parameter server is adjusted adaptively by considering the full performance of each node, thereby ensuring higher accuracy. Furthermore, our strategy prevents the ML model from being affected by irrelevant tasks in the same cluster. Experiments show that our strategy fully improves clustering performance, and it ensures the accuracy and convergence speed of the model, increases the model training speed, and has good expansibility.
WOS research areas: Computer Science; Engineering; Telecommunications
WOS categories: Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number: WOS:000430941500001
Content type: Journal article
URI: http://www.corc.org.cn/handle/1471x/2374232
Collection: Computer Network Information Center
Author affiliations:
1. Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou 310018, Zhejiang, Peoples R China
2. Minist Educ, Key Lab Complex Syst Modeling & Simulat, Hangzhou 310018, Zhejiang, Peoples R China
3. Zhejiang Univ, Coll Elect Engn, Hangzhou 310058, Zhejiang, Peoples R China
4. Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Zhejiang, Peoples R China
5. Zhejiang Prov Engn Ctr Media Data Cloud Proc & An, Hangzhou 310018, Zhejiang, Peoples R China
6. Chinese Acad Sci, Supercomp Ctr Comp Network Informat Ctr, Beijing 100190, Peoples R China
Recommended citation
GB/T 7714
Zhang, Jilin, Tu, Hangdi, Ren, Yongjian, et al. An adaptive synchronous parallel strategy for distributed machine learning[J]. IEEE Access, 2018, 6: 19222-19230.
APA Zhang, Jilin, Tu, Hangdi, Ren, Yongjian, Wan, Jian, Zhou, Li, ... & Wang, Jue. (2018). An adaptive synchronous parallel strategy for distributed machine learning. IEEE Access, 6, 19222-19230.
MLA Zhang, Jilin, et al. "An adaptive synchronous parallel strategy for distributed machine learning." IEEE Access 6 (2018): 19222-19230.