A distributed multiple sample testing for massive data | |
Xie Xiaoyue1,3; Shi Jian1,3; Song Kai2 | |
Journal | JOURNAL OF APPLIED STATISTICS |
2021-04-08 | |
Pages | 19 |
Keywords | Distributed scheme; hypothesis testing; fraud detection; classification |
ISSN | 0266-4763 |
DOI | 10.1080/02664763.2021.1911967 |
Abstract | When data are stored in a distributed manner, direct application of traditional hypothesis testing procedures is often prohibitive due to communication costs and privacy concerns. This paper develops and investigates a distributed two-node Kolmogorov-Smirnov hypothesis testing scheme, implemented with a divide-and-conquer strategy. Building on this scheme, the paper also provides a distributed fraud detection method and a distribution-based classification method for multi-node machines. Distributed fraud detection identifies which node in a multi-node system stores fraudulent data; distribution-based classification determines whether the distributions across nodes differ and groups the nodes by distribution. These methods can improve the accuracy of statistical inference in a distributed storage architecture. The feasibility of the proposed methods is verified through simulation studies and real-data examples. |
WOS Research Area | Mathematics |
Language | English |
Publisher | TAYLOR & FRANCIS LTD |
WOS Accession Number | WOS:000637242100001 |
Content Type | Journal article |
Source URL | [http://ir.amss.ac.cn/handle/2S8OKBNM/58423] |
Collection | Academy of Mathematics and Systems Science, Chinese Academy of Sciences |
Corresponding Author | Shi Jian |
Author Affiliations | 1.Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China 2.Beijing Inst Technol, Sch Management & Econ, Beijing, Peoples R China 3.Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China |
Recommended Citation: GB/T 7714 | Xie Xiaoyue,Shi Jian,Song Kai. A distributed multiple sample testing for massive data[J]. JOURNAL OF APPLIED STATISTICS,2021:19. |
APA | Xie Xiaoyue,Shi Jian,&Song Kai.(2021).A distributed multiple sample testing for massive data.JOURNAL OF APPLIED STATISTICS,19. |
MLA | Xie Xiaoyue,et al."A distributed multiple sample testing for massive data".JOURNAL OF APPLIED STATISTICS (2021):19. |
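
The abstract describes a two-node Kolmogorov-Smirnov test in which raw data stay on their nodes. This record does not give the details of the authors' scheme, so the following is only a minimal illustrative sketch of the general idea under stated assumptions: each node reduces its sample to empirical-CDF values on a shared grid (the grid and the function names `local_summary`, `kolmogorov_sf`, and `distributed_ks_test` are hypothetical), and a coordinator compares the two summaries. The grid-based statistic is a lower bound on the exact KS statistic, and the p-value uses the standard asymptotic Kolmogorov series.

```python
import bisect
import math


def local_summary(sample, grid):
    """Run on one node: ECDF values on the shared grid, plus the sample size.

    Only this summary leaves the node, not the raw observations.
    """
    xs = sorted(sample)
    n = len(xs)
    ecdf = [bisect.bisect_right(xs, t) / n for t in grid]
    return ecdf, n


def kolmogorov_sf(lam, terms=100):
    """Asymptotic tail probability of the Kolmogorov distribution at lam."""
    if lam < 0.2:
        # The tail probability is 1 to within floating-point noise here,
        # and the alternating series converges slowly for tiny lam.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def distributed_ks_test(summary_a, summary_b):
    """Run on the coordinator: grid-based KS statistic and asymptotic p-value."""
    (ecdf_a, n), (ecdf_b, m) = summary_a, summary_b
    # Largest ECDF gap over the shared grid (assumed approximation of the sup).
    d = max(abs(a - b) for a, b in zip(ecdf_a, ecdf_b))
    lam = math.sqrt(n * m / (n + m)) * d
    return d, kolmogorov_sf(lam)


if __name__ == "__main__":
    grid = [0.0, 0.25, 0.5, 0.75, 1.0]            # shared by both nodes
    node_a = local_summary([0.1, 0.2, 0.3, 0.4], grid)
    node_b = local_summary([0.6, 0.7, 0.8, 0.9], grid)
    print(distributed_ks_test(node_a, node_b))
```

A finer grid brings the statistic closer to the exact KS value at the cost of a larger message per node, which is the communication trade-off the abstract alludes to.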