Title | TencentBoost: A Gradient Boosting Tree System with Parameter Server |
Authors | Jiang, Jie ; Jiang, Jiawei ; Cui, Bin ; Zhang, Ce |
Year | 2017 |
Abstract | Gradient boosting tree (GBT), a widely used machine learning algorithm, achieves state-of-the-art performance in academia, industry, and data analytics competitions. Although existing scalable systems which implement GBT, such as XGBoost and MLlib, perform well for datasets with medium-dimensional features, they can suffer performance degradation for many industrial applications where the trained datasets contain high-dimensional features. The performance degradation derives from their inefficient mechanisms for model aggregation: either map-reduce or all-reduce. To address this high-dimensional problem, we propose a scalable execution plan using the parameter server architecture to facilitate the model aggregation. Further, we introduce a sparse-pull method and an efficient index structure to increase the processing speed. We implement a GBT system, namely TencentBoost, in the production cluster of Tencent Inc. The empirical results show that our system is 2-20x faster than existing platforms. Funding: National Natural Science Foundation of China [61572039]; 973 program [2014CB340405]; Shenzhen Gov Research Project [JCYJ20151014093505032]; Tencent Research Grant (PKU). Indexed in: CPCI-S(ISTP). Pages: 281-284 |
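Language | English |
Source | IEEE 33rd International Conference on Data Engineering (ICDE) |
DOI | 10.1109/ICDE.2017.87 |
Content type | Other |
Source URL | [http://ir.pku.edu.cn/handle/20.500.11897/469897] |
Collection | School of Electronics Engineering and Computer Science; School of Software and Microelectronics |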
Recommended citation (GB/T 7714) | Jiang, Jie, Jiang, Jiawei, Cui, Bin, et al. TencentBoost: A Gradient Boosting Tree System with Parameter Server. 2017-01-01. |